==============================================================================
 SOME TECHNICAL NOTES ABOUT IEC 61131-3 standard, and this implementation
==============================================================================

1. Assignment of variables in Graphical POUs

The standard is not clear about this: should the assignment of an output
to a variable in FBD take place as an individual assignment statement
(which has its own "execution order") or atomically, as part of a
function/FB invocation?

Consider for example:

1.a       _________          _______
         |   FB1   |        | FUNC2 |
         |     OUT1|--------|G      |
   A-----|IN   OUT2|--   B--|H      |---C
         |_________| |      |_______|
                     |
                     --------------------------- B

Should this be executed as three statements?

   FB1(IN:=A);
   C := FUNC2(G:= FB1.OUT1,H:=B);
   B := FB1.OUT2;

Or should the assignments rather be part of the call?

   FB1(IN:=A,OUT2=>B);
   C := FUNC2(G:= FB1.OUT1,H:=B);

This is not clear - and even less so if FB1.ENO=0

The second approach seems preferable, but consider these cases:

1.b       _________
         |   FB1   |
   A-----|IN   OUT2|-----C
         |_________|  |
                      ---B

1.c       _________         _______
         |   FB1   |       | FUNC2 |
   A-----|IN   OUT2|-------|H      |--C
         |_________|  |    |_______|
                      ---B

1.d       ________
         |   F1   |
   A-----|IN   OUT|--
         |________| |
                    |---- B   (wired or)
          ________  |
         |   F2   | |
   C-----|IN   OUT|--
         |________|

More problematic cases:
 - Can the same variable reference be used for reading and writing?
 - Can a variable assignment cross connectors?

In our implementation we opted for the following:

 - Variable assignments take place as part of the function call
   (second approach)
 - A varRef node has restrictions for connections: only one input (no
   wired-or), it must be connected to the output of a Function/FB, and
   no more than one varRef per FB output
 - No connectors allowed for variable assignment
 - A varRef can only be assigned for writing

Hence, diagram 1.c above is valid, 1.b and 1.d are not.

==============================================================================

2. Behaviour of EN/ENO variables

The IEC 61131-3 spec is not clear about several issues regarding EN/ENO
(and IEC 61131-8 does not shed more light).

Consider:

   C := FUNC1( EN:= COND , IN:= A , OUT1 => B , ENO => RES );

The spec states (2.5.1.2) that
"If the ENO output is evaluated to FALSE (0), the values of all function
outputs (VAR_OUTPUT, VAR_IN_OUT and function result) shall be considered
to be implementation-dependent."

But this does not clarify the following: in this case (ENO is FALSE),
should the output assignment OUT1 => B take place, or should the outside
var (B) be left untouched? And what about the ENO=>RES mapping, and the
assignment of C?

This question applies also to graphical languages:

2.a         ___________
           |   FUNC1   |
   COND----|EN      ENO|---- RES
      A----|IN     OUT1|---- B
           |___________|

There are several possible answers:

Codesys 3S adopts (at least in graphical languages) an extreme approach:
no output is mapped (not even ENO=>RES !) when ENO is FALSE.
This does not seem very sensible to us, because then the assignment
ENO=>RES would never give FALSE, and there would be no way of reading the
ENO value; this contradicts the reason for introducing the ENO => RES
syntax.

Another answer could be: only the ENO output is mapped, the rest is
untouched. But this sounds rather inconsistent and awkward, especially
considering the C := FUNC1() assignment. And what about this other one?

   D := X + FUNC1( EN:= COND , IN:= A , OUT1 => B , ENO => RES );

Our implementation:

  If ENO is FALSE (because EN was FALSE, or because the body reset ENO),
  the only formal output assignment performed is that of the ENO value.
  The informal output assignment is performed (but its value should be
  considered arbitrary).

  In more detail: the execution of a Function/FB comprises these steps:

    1.  The input args are assigned
    2.  ENO := EN
    3.  The body is executed if EN=1 (this could set ENO=0)
    4.a If ENO=1, all outputs are assigned.
    4.b Else (ENO=0), only the ENO output is assigned, as well as the
        informal output (return value of functions)

  (steps 2-3 are said to happen "inside the function", the rest happen
  in the call)

  Notice that in 4.b the outputs might still be read later from the
  caller scope, either by a FB.X reference or by a function assignment
  D := FUNC1(), but in these cases the value would be bogus/undefined.

  Notice also that this is closely related to our decision on the
  "Assignment of variables in Graphical POUs" issue (section 1).

==============================================================================

3. EN/ENO are implicitly defined or not?

When we create a POU, are the control variables EN/ENO already implicitly
defined, or should we include them among our variable definitions?
The standard is not clear about this.

Our implementation:

  Every FUNCTION or FUNCTION_BLOCK (user defined and standard) implicitly
  includes the EN/ENO pair. The graphical editor allows hiding/displaying
  them. (The compiler might choose to remove them if they are not used)

==============================================================================

4. VAR_IN_OUT variables

61131 Part 8 states that an in-out variable in a POU does not correspond
to local storage, but references instead an "outside" variable (ie from
the caller scope) which is not copied to a local variable (like
pass-by-reference semantics).

Our current implementation actually copies the original variable into a
local one at the start of the call, and copies it back at the end. The
resulting semantics are exactly the same, though there is a performance
hit.

Actually, a similar remark can be made with respect to VAR_INPUT.

However: how to implement this (IN_OUT/INPUT not having local storage)
considering that the inputs are always optional in formal calls, and can
have an initialization? (see fig 6.a)
The only feasible way seems to be to have a local variable with local
storage, and use that if no argument was passed, or else use the remote
one if it was passed.
But this negates much of the performance difference with respect to just
copying the passed value.

Furthermore, IN_OUT variables do not mix nicely with graphical languages.
Refer to 61131-3 2.5.1.1, table 5 and consider this (assume here that V
is IN-OUT and INC increments it by 1)

4.a       _________
         |   INC   |
   X-----|V        |
         |_________|

4.b       _________
         |   INC   |
   X-----|V       V|----X
         |_________|

4.c       _________
         |   INC   |
   X-----|V       V|----Y
         |_________|

4.d       _________        _______
         |   INC   |      |  F1   |
   X-----|V       V|------|       |
         |_________|      |_______|

Questions: Should the in/out variable be displayed on the right side or
not? In which cases should the value of X end up incremented? How would
these examples translate to ST?

If we interpret it as INC(V:=X), then the first right connection (4.b) is
redundant and confusing.
If not, then the variable V is just an internal variable that can be read
and written; and X is not passed by reference, but by value. And what to
do with 4.c?
Actually, it would be impossible to express 4.c and 4.d in textual
languages.

To mitigate these confusions we don't allow IN_OUT to be mapped on the
right, only on the left (as in diagram 4.a), and only a single varRef
must be connected!

This can still be confusing: it might not be apparent that, in diagram
4.a, the X variable is modified. But this is not much more evident in the
INC(V:=X) syntax.

==============================================================================

5. Functions vs FB in graphical languages

The standard assumes that functions are fundamentally different from FBs
in that functions "don't have a permanent storage associated" to them.
Hence, one cannot refer to "instances" of function calls.

But, again, this concept does not fit smoothly in FBD.
What about this?
                                 _____
5.a                             | F1  |
          _________          ---|_____|
         |  FUNC   |         |
   X-----|IN    OUT|---------|
         |_________|         |   _____
                             ---| F2  |
                                |_____|

Either we prohibit that an output from a function is connected to more
than one element (the standard is silent about this, and such a
prohibition would feel awkward; see in example F.6.7 ("Function block
LIMITS_ALARM") the output of the "/" function call)...

...or we are practically forced to have an instance of the function call
FUNC(X), just like an FB (or at least to create some storage for its
outputs).

Furthermore, such a simple diagram cannot be expressed legally in ST!

Because of this, we internally create an instance of every function call
(except binary/unary expressions) as a hidden variable.

==============================================================================

6. Data types, strong typing and generic functions

The standard is not clear about implicit data type conversions.

"Strong typing" seems to be assumed in 61131-3, and it is made more
explicit in 61131-8 (3.1.9 : "Assignment of the result of evaluating an
expression to a variable [should] be 'strongly typed'")

This seems to imply that, eg, this is invalid (compilation error)

   VAR
      I1,I2: INT;
      U1,U2: UINT;
      D1,D2: DINT;
   END_VAR
   D2 := I1;

Actually this is too cumbersome -and very few high level languages (if
any) are so strongly typed- and would require lots of explicit
conversions (which, to make matters worse, are plain function calls).

Besides being too strong, the typing system has ambiguities when generic
datatypes are introduced. It sounds sensible to declare a generic type
ANY_INT meaning "any int-like type" in the signatures of the functions,
but confusion soon arises.

Consider this basic signature:

   FUNCTION ADD(ANY_INT IN1, ANY_INT IN2) : ANY_INT

First doubt: Are the two arguments required to be of the same concrete
type or not? If not, what would be the return type of ADD(I1,D1)?
And of U2 := ADD(I1,U1) ? Is D2 := ADD(I1,D1) valid or not?
We could restrict this kind of function to having the same type for all
arguments, with this type determining the return type; but, besides being
too restrictive, this is not satisfactory for some standard functions
(eg: MID)

In some cases (eg Table 27 footnote d) it is specified for some functions
that "It is an error if the inputs and the outputs to one of these
functions are not all of the same actual data type"
But, again, it is not specified whether the compiler is supposed to catch
this error or not. It is not satisfactory to state that this is an
"implementation dependent" behaviour when no hint is given about how this
can be implemented in a sensible manner.

Furthermore: does the fact that the function returns "ANY_INT" mean that
the concrete type is resolved at compile time or not? If yes, how? If
not, then how can we validate the assignment?
Furthermore: how would a compiler implement the call, if it cannot know
(at compile time!) the concrete type of its arguments?

More related doubts: what about unqualified integer literals? Are they to
be interpreted as ANY_INT or as INT ?
If the first, the compiler would have a hard time deciding which concrete
function to call (think of ADD(200 - MUL(100*(-3))) )
If the second, it would be very cumbersome to disallow statements like
S1 := 2;

And at the same time, this very strong typing (bordering on the absurd)
clashes with the allowed use of the literal 1 as meaning TRUE...

All of this looks ill-specified, probably ill-designed. No wonder that
most implementations do not follow the standard here.

Our implementation opts for a weaker data typing, slightly weaker than
that of C or Java:

a. No conversion/casting is needed for widening conversions inside each
   datatype family:
      {BIT BYTE WORD DWORD LWORD}
      {SINT INT DINT LINT}
      {USINT UINT UDINT ULINT}
      {REAL LREAL}
   (eg: an INT can be directly used where a DINT is expected)
   ("vertical casting")

b. No conversion/casting is needed for integer/bit types of the same
   width:
   (eg: an INT can be directly used where a UINT is expected)
   ("horizontal casting")

c. Both points above imply that an INT can also be used where a UDINT is
   expected ("diagonal casting")

d. Generic functions are considered as mere aliases to concrete
   functions, which are resolved at compile time. These concrete
   functions can also be used directly.
   Eg: ADD( I1, I2) is internally converted to ADD_INT(I1,I2)
   In case of arguments of mixed datatypes, the compiler automatically
   figures out the best compatible legal casting, according to the rules
   above.
   Eg: ADD( U1, DI2) is internally converted to ADD_DINT( U1,DI2)
   This also implies that
      U2 := U1 + DI2
   is illegal (the sum returns a DINT, which cannot be automatically
   downcast to UINT), but
      DU2 := U1 + DI2
   is legal.

e. Unqualified integer literals are treated as integers of unknown width.
   The width is deduced at compile time, according to the context, and
   the range is checked then.
   Eg: ADD( I1, 3)  is converted to ADD_INT(I1 , INT#3)
       I1:= 50000;  is converted to I1:= INT#50000; which gives an error
                    (out of range)
   When a generic function is called with only unqualified numeric
   literals, the concrete type cannot be safely determined, and a default
   type is used (INT, WORD, REAL). There is no need to rely on this, so
   you are advised to remove such constructs:
   Eg: instead of ADD( 2, 3) force the desired type by using ADD_INT(2,3)
       or ADD(INT#2,3) (or, rather, why not type 5 directly?)

==============================================================================

7. IL: Accumulator

How is the type checking done with an accumulator that can hold any
datatype? Further, how can the compiler know which concrete function to
invoke for a plain statement like ABS?
Is the compiler supposed to track all the execution branches to do type
checking? That would be extremely complex. Or is it all done at runtime?
Then, the system would need some type of dynamic linking (resolving the
real function to be called at runtime), which would also be complex and
inefficient.

Our approach: type checking is done at compile time. This is simple and
efficient, but at the expense of prohibiting some constructs which would
lead to type ambiguities: you must use ABS_INT, ABS_DINT, etc instead of
ABS. And numeric literals should be type-qualified, unless using the
"basic" types (INT, REAL)

==============================================================================

8. IL: Multivalued types

It's not clear how to deal with arbitrary multivalued types (arrays,
structs) in the accumulator.

Our approach: only primitive scalar types are allowed in the accumulator
in IL

==============================================================================

9. LD vs FBD

The differences and the restrictions are not clear. Can we mix
Coils/Contacts with functions? Can we wire-or boolean outputs in FBD?

Our implementation is fairly permissive, and has few differences between
LD and FBD.

 - LD has left-right rails
 - The rules for execution order are slightly different. In both cases,
   nodes are grouped in "networks" (or "rungs"), with the same criteria
   (graphically connected nodes, excluding rails, and including
   pass-through connectors). But inside each network, the order of
   execution is determined mainly by position in LD (top to bottom, left
   to right), and mainly by topology (connections) in FBD
 - Wired-or connections should not be used in FBD
 - Coils/contacts should be used only in LD

==============================================================================

10. Execution order and recursion in LD/FBD

Recursion (feedback) in graphical languages is potentially dangerous, and
leads to scenarios that are ambiguous or difficult to compile (especially
if generic functions are chained, it can be difficult to determine the
concrete type to be used).
The execution order with feedback can be confusing or ambiguous.
Furthermore, using feedback from the output of functions is not
consistent with functions having no permanent storage.
On the other hand, some limited feedback is often clean and very useful -
almost necessary.

Our implementation:
  We allow feedback in graphical languages, but disallow feedback
  connections that start from a Function.
  A connection is considered "feedback" when its source node has an
  execution order greater than that of the target node. The execution
  order is determined mainly from the node positions in LD (top to
  bottom, left to right) inside each "rung" or "network", and mainly from
  the connection topology in FBD.

==============================================================================

11. Generic function names

In our implementation, a "generic function" like MUL is actually several
different functions, depending on the concrete type. We resolve this
mapping quite early, at compile time, so that the function MUL does not
really exist; it's just an alias that is resolved when doing type
checking. So, for example, if the arguments are INT, MUL(X,Y) is
interpreted as an alias for MUL_INT(X,Y)

This approach makes the compiler more robust, and errors are caught early
(before C generation, or run).

Some caveats: because a FUNCTION has a local variable with the same name
as the function, the above applies to it too: there is no "MUL" local
variable. This is mostly transparent for the user, but might be slightly
confusing when debugging, or when assigning the function output thus:

   MUL(IN1:=X, IN2:=2, MUL=>Z);

We allow that; internally, the formal parameter "MUL" is also rewritten
as MUL_INT (or the name for the corresponding type). This rewrite is also
applied in FBD, where the output name of the function can be omitted (we
also allow the use of the special name "OUT", as some compilers -eg
Beremiz- do)

==============================================================================

12. Strings / Wstrings

We currently don't implement WSTRING.
Nor do we implement a fixed size string.
In our implementation a STRING has arbitrary size.

==============================================================================

13. TYPE blocks

TYPE blocks can be defined inside a ST POU or outside it, in their own
file. In any case, their scope is global.

==============================================================================

14. Anonymous ARRAYS/STRUCTS and initialization

Anonymous ARRAYS/STRUCTs (user defined types declared specifically for a
POU variable, outside the TYPE block) are allowed.
But init expressions for ARRAY/STRUCT are only allowed for DERIVED
(named) ARRAYS/STRUCTS (defined in a TYPE block)

==============================================================================

15. BOOL type: conversions, operators

Several confusions might arise when converting to BOOL types, because of
two confusing features of the standard:

a. The BOOL type is placed inside the ANY_BIT datatype family; this
   suggests that it should behave the same as BYTE, WORD, etc, only with
   1-bit width. But this seems to ignore the elementary distinction that
   most programming languages make between bitwise and logical
   operations.
   For example: Table 52 states that NOT is a "Logical negation (one's
   complement)" ; but a one's complement is a bitwise negation, not a
   logical negation. If one defines W : WORD := 1, its NOT will be
   0xFFFE. The same must be said of the "Logical AND" ; it's actually a
   bitwise AND.
   Furthermore: Should a statement requiring a boolean type (eg: WHILE)
   accept those "logical" functions when applied to ANY_BIT (non BOOL)
   operands? Consider
      WHILE( W1 AND W2)
   Is that a valid expression ? It seems that it's not, if the language
   doesn't allow implicit "downcasting" - but then, again, that AND is
   not "logical". Now, we might want to use an explicit forced downcast
   via a conversion function, as in
      WHILE(WORD_TO_BOOL(W1 AND W2))
   but then, what should WORD_TO_BOOL do?
   Should it narrow the datatype to the least significant bit (in analogy
   with the other downcasts) or should it give "1(TRUE)" if the WORD is
   nonzero?

b. Similar confusions arise with conversions from numeric values to BOOL,
   especially considering the accepted literals 0=FALSE 1=TRUE
   What should INT_TO_BOOL(2) return?

Our implementation:

  All conversions from numeric and bit types to BOOL return 0(FALSE) if
  and only if the original value is zero. This is consistent with what
  most programmers expect - but is not consistent with other downcasts in
  the ANY_BIT family.
     WORD_TO_BYTE(0x1FE) => 0xFE
     WORD_TO_BOOL(0x1FE) => 1

==============================================================================

16. AT (Directly represented) variables syntax

a. AT variables cannot be anonymous and must be referenced by the alias
      VAR  X8 AT %I8 : BOOL; END_VAR    => correct
      VAR  AT %I8 : BOOL; END_VAR       => incorrect
      Y := NOT(X8);                     => correct
      Y := NOT(%I8);                    => incorrect
   This restriction might be removed in future releases
   (RESTRICTION REMOVED IN 2.0.12)

b. The name follows the norm: %[Direction][Width][Address]
   where
      Direction = 'I' (input)
                  'Q' (output)
                  'M' (memory=input/output)
      Width     = 'X' (1 bit)
                  'B' (byte: 8 bits)
                  'W' (word: 16 bits)
                  'D' (double word: 32 bits)
   We accept 'A' (analog) as equivalent to 'W'
   We accept any other letters in addition to (or instead of) the width.
   They are ignored. If the second character is not one of the accepted
   widths ([XBWAD]) the width defaults to 1
   The address must consist of groups of dot-separated digits (up to 4
   groups)
   Leading zeroes in the address are prohibited
   We don't accept the placeholder '*' (but read about the range spec in
   the device configuration file)
   We don't accept 64 bits width ('L').

c. The datatype of the corresponding variable must be one of the
   ANY_INT/ANY_BIT family, with the exact corresponding width.

   Examples of correct declarations:
      I8  AT %I8 : BOOL;
      XXX AT %IW8 : WORD;
      QQ  AT %QW8.4.12 : UINT;
      I3  AT %IB14.3.4.123233 : BYTE;

d. The range of mapped variables is specified in the device.properties
   file using ranges:
      %MW53.1-16 => matches %MW53.1 , %MW53.2 ... %MW53.16

==============================================================================

17. REFERENCE extension

Since 2.0.12 we implement a REFERENCE extension. A variable can be
declared as a REFERENCE to any primitive type and assigned at runtime.
Eg:
   R1 : REFERENCE TO INT;
In this example, R1 is an (initially unassigned) alias to an INT
variable. The special assignment
   R1 REF= X;
makes R1 behave as an alias to the INT variable X.
It is also possible to reference a member of a complex variable, eg:
   A1 : ARRAY [ 1 .. 3 ] OF INT;
   R1 REF= A1[2];
   R1 := 10;   // this is equivalent to A1[2] := 10;

==============================================================================

18. Should values of INPUT variables of a Function block persist among
    invocations?

The standard is not clear:
"All the values of the output variables and the necessary(SIC) internal
variables of this data structure shall persist from one execution of the
function block to the next..." (2.5.2)
This seems to suggest that inputs are not persisted, like TEMP vars.
But, on the other hand, if the input variable can be accessed (set) from
the outside before the call, the instance must store the value, and it
cannot reset it to the default initialization (like a TEMP variable) at
the start of the function body execution.
Consider:
   fb(IN1 := 3, IN2:=10);
   ...
   fb.IN2 := 50;
   ....
   fb(IN1:=4);
Recall that this includes the EN variable.
The ambiguity gets worse when the INPUT variable has an initialization
value. Should that value be applied every time the argument is omitted in
the function call, or only before the first run?

We adopt what seems most logical to us: inputs ARE persisted between
calls; they are initialized only on the first run.
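
The persistence rule just stated can be illustrated with a small Python
model. This is only a sketch of the semantics, not code from the
compiler: the class FunctionBlockInstance, its call() method and the
summing body are hypothetical names invented for the illustration, and
the initial values of 0 are an assumption. Only the call sequence is
taken from the example above.

```python
class FunctionBlockInstance:
    """Toy model of an FB instance whose VAR_INPUT values persist."""

    def __init__(self, **input_defaults):
        # Initial values are applied once, when the instance is created
        # (the "first run" of the text); they are never re-applied.
        self.inputs = dict(input_defaults)

    def call(self, **args):
        # Only the arguments actually passed overwrite stored inputs;
        # omitted arguments keep the value they already had.
        for name, value in args.items():
            if name not in self.inputs:
                raise KeyError(f"unknown input {name!r}")
            self.inputs[name] = value
        return self.body()

    def body(self):
        # Hypothetical body: return the sum of the two inputs.
        return self.inputs["IN1"] + self.inputs["IN2"]


fb = FunctionBlockInstance(IN1=0, IN2=0)  # assumed initial values
first = fb.call(IN1=3, IN2=10)   # fb(IN1 := 3, IN2 := 10);
fb.inputs["IN2"] = 50            # fb.IN2 := 50;
second = fb.call(IN1=4)          # fb(IN1 := 4);  IN2 is still 50
print(first, second)
```

Under the alternative reading (inputs re-initialized on every call), the
last call would see the initial value of IN2 rather than 50.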
This also applies to functions, but this is not problematic (it applies
only to function instances, which are created hidden and associated with
each function CALL - think of an FBD block)

These cases are problematic though:
 - Functions/FBs that modify INPUT vars in their bodies (they should be
   prohibited)
 - Unassigned IN_OUT variables (idem)

==============================================================================
Last update: July 2013
http://gebautomation.com/