- J versions: 9.4.2,
Found anything wrong? File an issue at https://github.com/bugsbugsbux/understanding-j/issues.
An introduction to the J programming language that gets to the point.
It is intended for those with (some) programming experience, but others should be mostly fine after looking up some basic programming terms like function, argument, class, instance, inheritance, statement, expression, etc.
Don't treat this as a reference: Section titles do not introduce an exhaustive explanation of a certain topic, but serve to give this document a rough structure. Individual sections are not intended to be read in isolation from the others and assume the knowledge of previous sections.
Run the examples and read the comments!
If you have J installed you can open this file in JQt via the file-selection dialog which opens with ctrl+o (make sure to set the filetype-filter to "all"). Click on a line with J code and press ctrl+enter to execute it.
Covered builtins are listed in the appendix with a short description.
Important links:
- Project Homepage: https://jsoftware.com
- Online J interpreter: https://jsoftware.github.io/j-playground/bin/html2/
- All builtin operators, with links to their wiki pages: https://code.jsoftware.com/wiki/NuVoc
- Good old wiki: https://www.jsoftware.com/help/dictionary/contents.htm
- Cheatsheet/Reference Card (not for beginners): https://code.jsoftware.com/wiki/File:B.A4.pdf
J was first released in 1990 as a successor to APL, an alternative mathematical notation that is computer-executable and works well with multi-dimensional array data. Most notably J switches from APL's custom symbol-set to ASCII only, calling most basic builtins by a single symbol or a symbol with appended dot or colon, and giving distinct meaning to single symbols that usually appear in pairs like various braces and quotes ([]"{} etc).
J has many number-notations; the most important are:
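A small sampler of number notations follows. Which notations count as the most important is my own selection here, so treat it as a sketch rather than the definitive list.

```j
1 2 3          NB. integers
_5             NB. negative five: the sign is an underscore, - is a verb
3.14           NB. floating point
1e3            NB. scientific notation: 1000
1r3            NB. rational number one third
2j3            NB. complex number with real part 2 and imaginary part 3
16b1f          NB. base 16: 31
100x           NB. extended precision integer
_ __           NB. infinity and negative infinity
```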
The notation is simple: just put elements next to each other. All elements must have the same type, but since an empty list has no elements it can be combined with any other list. True and false are represented simply by the two numbers 1 and 0. See also: strings
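A minimal sketch of list notation. It uses a few verbs (+, > and ,) before they are introduced, purely for illustration.

```j
1 2 3                NB. a list of three integers
10 20 30 + 1 2 3     NB. 11 22 33
1 2 3 > 2            NB. 0 0 1 (a list of booleans)
'' , 1 2 3           NB. 1 2 3 (the empty list joins with any list)
```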
Strings are lists, so an empty string is commonly used as notation for the empty list. The default string type is called "literal" and is UTF-8 encoded, which means that the 128 ASCII characters work well with arrays because each of them is a single byte long. String literals are single-quoted, and only this character is special; it is escaped by doubling it. See also: string representation and unicode
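For illustration, a few string literals. Monad # (count) and dyad , (join) are used here only to show that strings behave like lists.

```j
'hello world'        NB. a string
'it''s'              NB. it's
# 'it''s'            NB. 4 characters
# ''                 NB. 0: the empty string is an empty list
'abc' , 'def'        NB. abcdef
```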
Nouns are data values such as the ones covered until now. Boxes, which will be covered later, are nouns too. Lists are actually just a basic case of arrays which can be thought of as nested lists.
Classes and their instances aren't nouns: they cannot be referenced directly; but their name can be saved as a string.
Functions aren't nouns either; but by putting them into a special kind of list, gerunds, they can effectively be made one (also covered below).
J comes with some specialized data types, which won't be covered here, but shall at least be mentioned:
- Symbols: They are a specialized type of string, which is immutable, treated as a single element and has optimized performance with searching, sorting and comparison operations. However, only a few, often used ones should be created, as they are registered in a global table and cannot be removed from it anymore. More info at: https://code.jsoftware.com/wiki/Vocabulary/sco https://www.jsoftware.com/help/dictionary/dsco.htm
- Sparse arrays: If a sizable array (for example a list; see: arrays) consists of mostly the same repeated value, its size can be reduced by only saving the values which are not this so called sparse-element. Contrary to regular arrays, sparse arrays are displayed as a table of index-value pairs of the non-sparse-elements and need to be queried explicitly for their sparse-element. Computations with them are envisioned as equivalent to regular arrays but this is not fully implemented (for example, sparse arrays cannot be boxed (see: boxes) and strings (which are arrays) cannot be sparse); the results are usually sparse arrays too, however, the sparse-element might differ! More info at: https://www.jsoftware.com/help/dictionary/d211.htm
Verbs are functions that take nouns as their argument/s and return nouns. A verb has access to its left argument via the local variable x and to its right argument via the local variable y. If there are arguments on both sides of a function it is called dyadic (such verbs are dyads) otherwise the function is monadic (and such verbs are called monads; not related to the haskell term).
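A minimal example of one symbol used as a monad and as a dyad:

```j
- 5          NB. monad (negate): _5
7 - 5        NB. dyad (subtract): 2
```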
Note that the monadic and dyadic case here are two distinct functions: negate-number and subtract. They share a name (the symbol -) and a definition: the function - is ambivalent. A function does not have to have both cases.
Modifiers are functions used to pre-process a statement/expression; this means they run before the main evaluation step. Modifiers return a new function that has, additionally to its own arguments x and y, access to the original arguments as variables u (left, m may be used instead to indicate a noun) and v (right, use n instead to indicate a noun). The new function may return any entity, even more modifiers, which would also be processed before the first verb evaluates!
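A short sketch of a modifier at work, using the adverb / on the verb +:

```j
+ 1 2 3      NB. monad + (conjugate) returns real numbers unchanged: 1 2 3
+/ 1 2 3     NB. 6
```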
The last example showed that +/y is not the same as +y but rather y1 + y2 + y3 .... This is because / is an adverb (a modifier that takes one argument to the left): It creates a new function that inserts the original argument + between all elements of its own argument 1 2 3.
Conjunctions only differ insofar as they (meaning the original modifiers not the returned entities) take two arguments. For example & takes a dyad and a noun and creates a new verb which always uses the given noun as one of its arguments and the argument to the newly created monad as the other argument:
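A minimal sketch of binding a noun with &; the names inc and halve are made up for this example.

```j
inc =: 1&+        NB. noun bound as left argument of +
inc 5             NB. 6
halve =: %&2      NB. noun bound as right argument of %
halve 10          NB. 5
```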
Assignments use the operators =: (global) and =. (function-local). They return their values, but the interpreter does not display them, so often monads [ or ], which return their argument unchanged, are used as prefix. When assigning to multiple variables the return value is still the original argument.
It is possible to assign multiple values at the same time, but the value returned is the unchanged original argument.
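The following sketch shows single and multiple assignment; all variable names are placeholders chosen for this example.

```j
aa =: 5                  NB. assigned, but nothing is displayed
] bb =: 6                NB. displays 6
'cc dd' =: 10 20         NB. cc is 10 and dd is 20
] 'ee ff' =: 10 20       NB. displays 10 20, the unchanged argument
```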
See also: importing code
To run the code from some file use one of these convenient verbs:
- load: Runs the specified file or shortname (list available shortnames with scripts'').
- loadd: Like load but displays the lines before executing them.
- require: Like load but files that were already loaded won't be loaded again.
As these verbs, like all functions, have their own scope, assignments in the loaded file/s which use =. but target the current global scope get captured by the loading function's scope and thus are not available afterwards!
Explicit-definitions are created with the : conjunction which returns an entity of the type indicated by the integer to its left:
- 0 for nouns,
- 1 for adverbs,
- 2 for conjunctions,
- 3 for monadic (optionally ambivalent) verbs,
- 4 for dyads.
The entity's value is specified as the right argument and is a string for functions. A value of 0 always means to read in the following lines as a string - until the next line containing ) as its only printable character.
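A minimal sketch of explicit definitions, one-line and multi-line. The names double, plus and greet are hypothetical.

```j
double =: 3 : 'y * 2'       NB. monad defined from a string
double 4                    NB. 8
plus =: 4 : 'x + y'         NB. dyad
3 plus 4                    NB. 7
NB. a right argument of 0 reads the body from the following lines:
greet =: 3 : 0
  'hello ' , y
)
greet 'J'                   NB. hello J
```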
As demonstrated, explicit definitions specify the type to create as a number. This pattern of selecting functionality with a numeric index is sort of J's guilty pleasure. In most cases one would look up the functions in the docs and assign an alias to often used ones; in fact, J already comes with a set of aliases (and other helpers): list them with names_z_''. New users should definitely look over the docs for !: (foreign function index) and o. (circle functions) to see what's available: https://code.jsoftware.com/wiki/Vocabulary/Foreigns https://code.jsoftware.com/wiki/Vocabulary/odot#dyadic
Direct definitions, DDs for short, are another way to write explicit definitions. They are wrapped in double-braces and assume their type from the argument-variable-names used:
- If variables v or n are used, a conjunction,
- otherwise, if u or m are used, an adverb,
- otherwise, if x is used, a dyad or ambivalent verb is created.
A type may also be forced by appending ) and one of the letters n (noun), a (adverb), c (conjunction), m (monad) or d (dyad) to the opening braces.
The following examples contain more info about DDs and should be compared carefully with their actual output!
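A few anonymous direct definitions as a sketch of how the type is derived from the names used:

```j
{{ y * 2 }} 4            NB. only y is used, so a monad: 8
3 {{ x + y }} 4          NB. x is used, so a dyad: 7
+ {{ u/ y }} 1 2 3       NB. u is used, so an adverb: 6
3 {{)d x - y }} 4        NB. type forced to dyad: _1
```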
An array is a ((list of) list/s of) value/s. Thus even single values (scalars/atoms) are actually arrays. All elements must have the same type, and on the same nesting-level (dimension/axis) every element has to have the same length.
Therefore an array can be described by its type and shape, which is the list of the lengths of elements on each nesting-level. The length of the shape is called rank.
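To make type, shape and rank concrete, here is a small sketch using the standard helper datatype together with $ (shape), # (count) and i. (which builds an array of the given shape):

```j
datatype 1 2 3       NB. integer
datatype 'abc'       NB. literal
$ 1 2 3              NB. 3
$ i. 2 3             NB. 2 3
# $ i. 2 3           NB. rank 2
```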
The container-type box is useful to get around the restrictions of arrays: A boxed value always appears as scalar of type box; therefore any values, each in their own box, can be put into the same array!
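A minimal sketch of boxing values of different types into one array. The name mixed is a placeholder, and { (index) is used ahead of its introduction.

```j
< 1 2 3                      NB. a boxed list is a scalar: its shape is empty
mixed =: 1 ; 'two' ; 3 4     NB. dyad ; boxes its arguments and joins them
# mixed                      NB. 3
datatype mixed               NB. boxed
> 1 { mixed                  NB. open the second box: two
```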
A box not containing another box may be called leaf when talking about trees, structures consisting of nested boxes. J comes with special functions to simplify working with trees, such as {:: to index them (see: indexing).
J has all common control-structures and -words, however, they can only be used within explicit functions (explicit definitions or DDs). They are not idiomatic J and hinder new users from learning to think in array programming style (see: idiomatic replacements for explicit control-structures). Nevertheless, some problems are simpler to solve this way.
- Assert: All atoms have to be 1.
    {{
        assert. 'aa'='aa'   NB. dyad = compares corresponding atoms: 1 1
        assert. 'aa'='ab'   NB. error because not all 1s: 1 0
    }}''
- Conditionals: First atom must not be 0.
    {{
        if. 0 1 do.         NB. first atom is 0 -> false
            echo 'if block'
        elseif. 1 0 do.     NB. first atom is not 0 -> true
            echo 'elseif block'
        elseif. 1 do.       NB. only considered if no previous block ran
            echo 'elseif 2 block'
        else.               NB. runs if no previous block ran
            echo 'else-block'
        end.
    }}''

    NB. therefore be careful with tests such as the following,
    NB. which prints 'true' because 'bar' = 'baz' is 1 1 0:
    {{ if. 'bar' = 'baz' do. echo 'true' else. echo 'false' end. }}''
- Select: It first boxes unboxed arguments to select., fcase. or
case., then compares whether one of the boxes passed to select.
is the same as a box passed to a case. or fcase. statement and
executes the respective block. An empty case. condition always
matches. After the evaluation of a case. block execution jumps to
end., while after executing an fcase. (fallthrough) block the
following case. or fcase. runs unconditionally.
    match =: {{
        select. y
        case. 'baz' do. echo 'does not match ''bar'' due to both being boxed'
        case. 'bar' do. echo 'after case. jumps to end.'
        case. 1 2 3 do. echo 'due to the boxing only matches the list 1 2 3'
        case. 1; 2; 3 do. echo 'box it yourself to get the desired cases'
        case. 4;5 do. echo 'one select.-box matching one f/case.-box suffices'
        fcase. 'fizz' do. echo 'after fcase. always executes next f/case.'
        fcase. 'buzz' do. echo 'fcase. stacks dont have to start at the top.'
        case. 'nomatch' do. echo 'no match but triggered by preceding fcase.'
        case. do. echo '... empty case. always matches'
        end.
    }}
    echo '---'
    match 'bar'
    match 1 2 3
    match 1
    match <2        NB. won't get boxed a second time
    match 4; 6
    match 4; 4      NB. despite several matches only runs block once
    match 'fizz'
    match 'buzz'
    match 'shouldn''t match but...'
- While loops: Only run if first atom is not 0.
    {{
        foo =: 3
        while. (foo > 0), 0     NB. dyad > true if x greater than y
        do.
            echo foo =: foo -1
        end.
    }} ''
- Whilst loops: are while loops that always run at least once.
    {{
        whilst. 0 1 do.
            echo 'runs at least once'
        end.
    }} ''
- For (-each) loops: There are no classic for loops. Iterating over a
list of integers can mimic one. If a for-loop does not terminate early
its element-holding variable is set to an empty list and its
index-holding variable to the length of the list. If it terminates
early their last states are kept.
    {{
        for. 1 2 do.
            echo 'runs once per item. Neither item nor index are available...'
        end.
    }} ''

    fn =: {{
        foo =: 'for_<foo>. hides global <foo>[_index]. with local versions'
        NB. see: namespaces for more info about local/global assignments
        foo_index =. 'for_<foo>. modifies local <foo>[_index]'
        for_foo. 'abc' do.
            echo foo; foo_index
            if. y do.
                early =. 'yes'
                break.
            else.
                early =. 'no'
                continue.   NB. does not terminate loop just current iteration!
            end.
            echo 'never runs'
        end.
        echo 'terminated early?'; early; 'foo:'; foo; 'foo_index:';foo_index
    }}
    fn 1
    fn 0
    echo 'value of global foo:'; foo    NB. unchanged
- Return statements: Exit function early. Pass a return-value as left
argument!
    {{                      NB. functions return last computed value
        <'foo'
        <'bar'
    }} ''
    {{
        <'foo'
        return.             NB. exits early and returns last computed value
        <'bar'
    }} ''
    {{
        <'foo'
        <'fizz' return.     NB. or given left arg (parens not necessary??)
        <'bar'
    }}''
- Goto: goto_MYLABEL. continues execution at the location marked: label_MYLABEL.. Consider that many (most prominently Dijkstra, who advocated for structured programming) warn of goto's tendency to obscure the flow of a program, making it hard to maintain and debug. Goto's best usecase is probably to escape deeply nested loops, which are inherently un-idiomatic J code anyways (see: idiomatic replacements).
- Error handling: see: Errors
Verbs related to errors and debugging can be found in section 13 of the foreign function index (13!: n) but there are default aliases, which will be used below.
Errors have a number and a message, which defaults to the (err - 1)th element in the list of default error messages returned by 9!:8''. dberr queries the last error's number and dberm gets its message.
To raise an error its number is passed to dbsig which optionally accepts a custom error message as left argument. (Show the correct function name instead of "dbsig" in the message by using (13!:8) directly).
When J runs in debug mode, which can be en/disabled with a boolean argument to dbr, and queried with dbq, errors in named functions pause execution and messages contain a bit more information; the program state can then be investigated and altered.
J has a throw. keyword which is equivalent to raising error 55 with dbsig and behaves differently from other errors: throw.ing immediately exits the current function and goes up the callstack until it finds a caller which used a try. block with a catcht. clause, which is then executed. If no catcht. is found an "uncaught throw." error is raised.
The rank of a noun is the length of its shape. Note that the shape of a scalar/atom is an empty list, thus a scalar's rank is 0; as well as that the number of elements on the lowest dimension (atoms) is the last number in the shape:
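A minimal sketch of shape and rank for an atom and for a three-dimensional array:

```j
$ 7                NB. an atom has an empty shape (displayed as an empty line)
# $ 7              NB. rank 0
$ i. 2 3 4         NB. 2 3 4: each innermost list holds 4 atoms
# $ i. 2 3 4       NB. rank 3
```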
Every verb defines a maximum allowed rank for its arguments; this is called the "rank of a verb". As a verb may have a monadic and a dyadic version and a dyad has two arguments, "the" rank of a verb is actually a list of the three ranks in the following order: monadic, dyadic-left and dyadic-right rank.
If an argument has a higher rank than is allowed it is broken into pieces of this rank called "cells". The verb then operates on each of these cells individually and finally all results are returned.
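As a sketch, the ranks of a verb can be queried with b. 0, and the rank conjunction " can be used to apply monad < (box) to cells of a chosen rank, which makes the cells visible:

```j
+ b. 0             NB. ranks of +: 0 0 0
# b. 0             NB. ranks of #: _ 1 _ (_ means unlimited)
<"0 i. 2 3         NB. rank 0: boxes every atom, result has shape 2 3
<"1 i. 2 3         NB. rank 1: boxes every row, result has shape 2
<"2 i. 2 3         NB. rank 2: boxes the whole table, result is a scalar
```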
Deducing the shape of the cells in these examples is simple; comparing them with the shape of the original argument reveals that the shape of cells is a (possibly empty) suffix of the argument's shape and has the length of the rank of the verb. The example operating on atoms also showed that the final shape is not simply a (one-dimensional) list of the individual results: Rather, it is the individual results arranged in a so called "frame", which is the (possibly empty) leading part of the argument's shape, that was not used to determine the shape of the cells.
While dyads follow essentially the same rules, they process two cells at a time, so the cells have to be paired up: Generally, every left cell is paired with the right cell that corresponds to it by index. If this is not possible an error is raised. Therefore, dyads do not pair each left with every right cell, which is what applying adverb / to a dyad does, even though it sometimes looks like it due to the following exception: When there is a single cell on one side, it is paired with every cell from the other side (individually).
Dyads, too, arrange the individual results in a frame determined by the argument. However, as there are two arguments and thus two frames, these may not contradict each other, which they do not when they are equal or one simply omits the last elements of the other:
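A minimal sketch of agreeing and disagreeing frames with the rank 0 verb +:

```j
10 20 + i. 2 3         NB. frames 2 and 2 3 agree: 10 is added to row 0, 20 to row 1
(i. 2 3) + i. 2 3      NB. equal frames: atoms are paired one to one
NB. 10 20 30 + i. 2 3  would raise a length error: frames 3 and 2 3 disagree
```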
When one frame is longer than the other, the shorter frame is also used to group the cells of both arguments: On the side of the shorter frame these groups will always contain a single cell, which is then paired with each cell from the corresponding group on the other side individually. Finally, the results of each group are arranged by the difference of the frames, while the groups are arranged by the shorter (common) frame.
Positive verb-ranks, such as the ones used until now, select verb-rank-amount of trailing dimensions to be used as cells for the verb to operate on. Negative ranks, however, select leading dimensions, or in other words, the frame.
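To compare positive and negative ranks, this sketch boxes the cells of an array of shape 2 3 4:

```j
<"1 i. 2 3 4       NB. cells are the lists of length 4, the frame is 2 3
<"_1 i. 2 3 4      NB. the frame is 2, cells are the tables of shape 3 4
$ <"_1 i. 2 3 4    NB. 2
$ <"_2 i. 2 3 4    NB. 2 3
```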
Rank, even the default one, should always be considered as the application of a modifier -- and therefore as creating a new function which handles the invocation of the verb with the correct arguments. It is not simply a property of the verb that could be altered at will. Because of this, verb rank can never be increased: Every rank-conjunction can only work with the arguments it receives (possibly by a previous rank-operator) and therefore can only further constrain the rank.
Incompatible values can still be put into the same structure by using boxes. When they are of the same type, lower dimensional values can also be padded to match the shape of the higher dimensional value and thus the boxes can be omitted. This is different from reshaping!
Unboxing automatically pads values. Behind the scenes this happens all the time when assembling subresults into a frame.
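A short sketch contrasting padding with reshaping:

```j
> 1 2 3 ; 4 5        NB. 4 5 is padded with 0 to fit a table of shape 2 3
> 'abc' ; 'de'       NB. strings are padded with spaces
2 3 $ 1 2 3 4 5      NB. reshaping instead reuses elements from the start
```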
After reading the previous sections the following rules should not be surprising:
- Evaluation works right to left and in two steps:
    - Evaluate modifiers until only nouns and verbs are left,
    - then evaluate those.
Right to left does not simply mean reading the words in reverse order but to find the rightmost verb or modifier (depending on current step), then determining its arguments and replacing all of that with its evaluation result. This repeats until the line is processed entirely.
    - -/ 1 2 3      NB. 1. modifiers: find rightmost: / its arg is -
    - 1 - 2 - 3     NB. 1a. result -fn 1 2 3 but expressed as its effect
    - 1 - _1        NB. 2. verbs: rightmost - has args 2 and 3 result _1
    - 2             NB. rightmost verb - has args 1 and _1 result 2
    _2              NB. rightmost verb - only has right arg: 2 result _2
    0 1 + - 2 3     NB. this is (0 1 + (- 2 3))

- No mathematical operator precedence, just right to left evaluation and parentheses.
    2 * 3 + 4       NB. 14
    (2 * 3) + 4     NB. 10

- Parentheses form subexpressions, that complete before the parent.
    (3-2) - 1       NB. subexpr completes first and returns result: 1
    1 (2) 3         NB. if this was not an error (2) were the number 2
    1, (2), 3       NB. but it is a subexpr not a noun thus , is needed
    (- +) 1 2 3     NB. (-+) is not simply -+ but creates a (see:) train
    - + 1 2 3       NB. trains are not equivalent to unparenthesized expr

- Consecutive modifiers are processed left to right:
    1 - ~ 3         NB. adverb ~ swaps the args of the verb it creates
    -~ 3            NB. or copies the right arg to the left
    join0 =: ,"0    NB. join atoms; technically already has a modifier "
    'AB' join0 'CD' NB. but let's treat this as just some rank 0 verb
    'AB' join0 ~ / 'CD'
        NB. first creates a function invoking join0 with swapped args
        NB. then combines each x- with each y-element using this new function
    'AB' join0 / ~ 'CD'
        NB. first creates a function that combines each x- with each y-element
        NB. using dyad join0; then swaps the args to this new function

- Nouns are evaluated immediately, verbs only when called:
    mynoun =: unknown + 1           NB. error; evaluates expr to get noun
    myverb =: 3 : 'unknown + 1'     NB. ok because not yet evaluated
    myverb''                        NB. error
An isolated sequence of verbs (in other words: multiple verbs not followed by a noun) creates a train, that is a special pattern defining which verbs receive what arguments and execute in which order. The basic idea is called fork and best explained by the computation of the arithmetic mean:
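The arithmetic mean as a fork, sketched with the placeholder name mean:

```j
mean =: +/ % #                 NB. fork: (sum) divided by (count)
mean 1 2 3 4                   NB. 2.5
(+/ 1 2 3 4) % (# 1 2 3 4)     NB. the same computation spelled out
```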
A longer train simply continues this pattern using the result of the previous fork as the right argument to the next fork. When the last fork doesn't get a left argument (because the train is of even length) it uses an argument of the train instead: its left, or if missing its right one. This is called hook-rule and when it's used the whole train is called a hook, otherwise it's a fork.
Using my own terminology: A train consists of operators (the odd numbered verbs counting from right) that operate on the train's arguments, and combinators (the even numbered verbs counting from right) that are dyads combining the result of everything to their right with the result of the operator to their left.
As said before, a hook's last combinator uses an argument of the train to replace its missing left operator. When the train is dyadic, another peculiarity of hooks becomes evident: A hook's operators are always monads, ignoring the train's left argument:
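A minimal sketch of a hook used monadically and dyadically, with a dyadic fork for comparison:

```j
(+ -) 5        NB. monadic hook: 5 + (- 5) is 0
2 (+ -) 5      NB. dyadic hook: 2 + (- 5) is _3, the operator - ignores the 2
2 (+ % -) 5    NB. dyadic fork for comparison: (2 + 5) % (2 - 5)
```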
Any left operator may be replaced with a noun to create a so called NVV (noun-verb-verb) fork, which simply uses this noun as its left argument:
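For illustration, an NVV fork with the noun 10 in place of the left operator (*: squares its argument):

```j
(10 + *:) 3      NB. 10 + (*: 3) is 19
```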
Any left operator may be replaced with [: to create a so called capped fork which converts its combinator into a monad:
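A short sketch of capped forks, monadic and dyadic:

```j
([: - +/) 1 2 3      NB. - (+/ 1 2 3) is _6
2 ([: - +) 3         NB. - (2 + 3) is _5
```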
Experiment with some trains:
A gerund is a list containing (the boxed definitions of) verbs, thus creates a noun from verb/s. It can be created with the conjunction ` (backtick) and applied in different ways with the conjunction `: (backtick colon). (see also: indexing)
Many modifiers which apply their verb multiple times may instead take a list of verbs, from which to take one per needed application of a verb. If there are too few verbs in the list, it is simply looped over again as often as needed.
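A sketch of creating and applying a gerund; the name ger is a placeholder.

```j
ger =: +`-`*           NB. a list of three boxed verb definitions
# ger                  NB. 3
ger `: 0 ] 5           NB. apply each verb separately: 5 _5 1
ger `: 6 ] 5           NB. as the train (+ - *): (+ 5) - (* 5) is 4
+`- / 1 2 3 4          NB. verbs alternate: 1 + (2 - (3 + 4)) is _4
```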
Variable-names may only contain
- ASCII-letters (upper and lowercase),
- numbers (but not as first character) and
- underscores (except leading, trailing or double).
Unknown variables are assumed to be verbs because nouns are evaluated immediately when defined, so their variable parts have to exist beforehand. Verbs, however, are only evaluated when used, so unknown parts at the time of definition are ok. This also allows functions to reference themselves!
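A self-referencing verb, as a minimal sketch with the hypothetical name fact:

```j
NB. fact refers to itself; the name only needs to exist when the verb runs
fact =: 3 : 'if. y <: 1 do. 1 else. y * fact y - 1 end.'
fact 5        NB. 120
```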
Variables create subexpressions just like parentheses do; imagine they expand to their parenthesized value:
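A small sketch of the difference this makes; avg is a placeholder name.

```j
avg =: +/ % #
avg 2 4           NB. 3, because this is (+/ % #) 2 4
+/ % # 2 4        NB. 0.5, without parentheses no train is formed
```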
Each name lives in a namespace also called a locale:
Every function has its own local scope, that is an unnamed namespace that cannot be inherited, thus a function defined within another function does not have access to the outer function's namespace (except using u. or v.; see: Applying modifiers to local variables).
Control-structures do not have their own namespaces. for_name-loops always use the local scope to create/update the loop variables.
The second type of namespace is the named namespace/locale, of which there may be many but at any time only one may be the current/global namespace, which is inherited by any function('s local namespace) when it is called. Inheriting means if a name is not found it is looked for in the inherited namespace/s, which for new named namespaces is "z" (where the standard helpers are defined) by default.
The examples showed that to evaluate a name in another namespace this needs to become the global namespace first, as well as that namespaces are addressed with strings. However, there is a simpler way of writing this (but behind the scenes it does the same): Locatives are names that specify a namespace either by appending its name in underscores or by appending two underscores and the name of a variable that contains the (boxed) name.
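Both kinds of locative, sketched with the made-up names answer, myns and where:

```j
answer_myns_ =: 42        NB. name answer in namespace myns (created on first use)
answer_myns_              NB. 42
where =: <'myns'          NB. boxed namespace name stored in a variable
answer__where             NB. 42
```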
Note that in the last 4 lines the global namespace only changed during the evaluation of the locative in echo coname__ns'' but not while executing echo with the result of coname__ns''. Moreover, while echo too is a verb from another namespace, it is inherited and thus does not change the global namespace! To use a locative-accessed verb with the current global namespace use adverb f. to pull the verb into the current namespace before executing it:
Namespaces are automatically created when used.
Function-namespaces not being inheritable becomes a problem when applying modifiers to local variables:
During evaluation phase 1 (see: Evaluation Rules) modifiers produce verbs with identical body but u and v replaced with the operands of the modifier, which were not yet evaluated. Then, during phase 2, the verb encounters the unresolved name, tries to look it up and does not find it since it does not have access to the caller's namespace.
If a modifier uses u./v. instead of u/v the resulting verb starts the lookup of a name they stand for from the scope of its caller! Here is an example:
In J, classes are simply namespaces whose instances (also namespaces) inherit them. As assigning to inherited names just creates an own version of the name in the inheriting namespace, shared fields need to be accessed with a locative.
By convention conew is used to create new instances: It calls cocreate with an empty argument resulting in a new namespace with a stringified, previously unused, numeric name and then prepends the class' namespace to the list of ancestors. Finally conew remembers the namespace from which the new instance was created as the variable COCREATOR. Another convenience of the conew verb is that its dyadic version also calls the monad create with its left arguments. By convention a destructor should be implemented as the monadic function destroy calling codestroy which simply uses coerase on the current namespace.
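A minimal sketch of a class with constructor, method and destructor. The class name Counter and its members are hypothetical; coclass and cocurrent are the standard helpers for switching the current namespace.

```j
coclass 'Counter'                   NB. switch to the class namespace
create =: {{ count =: y }}          NB. constructor, called by dyadic conew
inc =: {{ count =: count + 1 }}     NB. method: updates the instance variable
destroy =: codestroy                NB. destructor
cocurrent 'base'                    NB. back to the default namespace

cnt =: 5 conew 'Counter'            NB. new instance, runs create with 5
] last =: inc__cnt ''               NB. 6
destroy__cnt ''                     NB. removes the instance namespace
```

Because inc runs with the instance as the current namespace, its global assignment creates or updates count in the instance rather than in the class.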
This demonstrates creating a class inheriting from another, as well as how to use class-variables (as opposed to instance-variables), by creating a singleton, that is, a class that at any time may only have one instance:
How indices work in J:
- The first element has index 0.
- Negative indices select from the back (thus the last element can be accessed with index _1).
- Each element in an index is a separate selection (of what differs per verb).
- An unboxed list of indices selects (multiple of) the array's top-level elements, because each element is a separate selection and
- trailing axes/dimensions may be omitted, resulting in all their elements being selected.
- Top-level boxes (1st boxing level) contain paths, that is lists of selections per axis/dimension of an array. In other words: Each element in a box selects on a different axis/dimension of the array, which effectively means they subindex the previous result.
- To select several elements on the same axis/dimension group them in another box (2nd boxing level).
- Instead, elements can be excluded per axis/dimension by wrapping them with another two boxes (3rd boxing level).
- To select all elements of an axis/dimension exclude none by putting an empty array at the third boxing level (<<<'' or <<a:).
- Keep in mind that ; doesn't add a boxing level to a boxed right argument!
The verb { can index arrays but not boxes. Every element in its left argument is a new selection starting from the same dimension (that the verb was applied to).
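The indexing rules applied with {, as a short sketch:

```j
0 { 10 20 30            NB. 10
_1 { 10 20 30           NB. 30
0 2 { 10 20 30          NB. 10 30: two separate selections
1 { i. 3 4              NB. 4 5 6 7: trailing axis omitted, whole row selected
(<1 2) { i. 3 4         NB. 6: a boxed path selects row 1, then column 2
(<0 2;1) { i. 3 4       NB. 1 9: rows 0 and 2, column 1
```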
To replace elements use the } adverb that is used like { with an additional left argument specifying the replacement/s:
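A minimal sketch of replacing elements; note that the original array is not modified, a new one is returned.

```j
99 (1) } 10 20 30             NB. 10 99 30
99 (<1 2) } i. 3 4            NB. replaces the 6 in row 1, column 2
0 (0 2) } 10 20 30            NB. 0 20 0: one replacement reused for both indices
```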
To be able to write index getters that work for any array } accepts a dyad as left argument: It takes the replacement/s as left and the original array as right argument and returns the indices of a flattened version of the array at which to replace values.
Ranges can be conveniently specified with (combinations of) the following verbs:
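A few range-building expressions as a sketch. The choice of i. (integers), {. (take) and }. (drop) is my assumption about which verbs are meant here.

```j
i. 5                 NB. 0 1 2 3 4
i. _5                NB. 4 3 2 1 0
3 + i. 4             NB. 3 4 5 6
2 {. 10 20 30 40     NB. take the first two: 10 20
_2 {. 10 20 30 40    NB. take the last two: 30 40
1 }. 10 20 30 40     NB. drop the first: 20 30 40
```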
{:: is like { but opens its results and uses each path to continue indexing into the previous result instead of returning to the starting level and producing more result values. The final result is not opened if it is not a single box or the last path's last part is a list (monad , can be used to promote a scalar to rank 1). {:: only works with paths because it will wrap lists of numbers in a box first.
As gerunds (see: gerunds) are simply lists of boxed verb-definitions, they could be indexed like any other list; however, this would not return an executable verb! Conjunction @. takes an index as right argument and returns the corresponding verb from the gerund on the left in executable form. Selecting multiple verbs returns a train (see: trains) and parentheses can be set by boxing the indices accordingly. (See also: conditional application)
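Selecting verbs from a gerund with @., as a minimal sketch:

```j
(+`-`* @. 1) 5            NB. selects -, so: _5
7 (+`-`* @. 0) 5          NB. selects +, so: 12
(+`-`* @. 0 1) 5          NB. selects the train (+ -), a hook: 5 + (- 5) is 0
```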
The control-structures described above (see: explicit control-structures and control-words) can be replaced with more idiomatic versions which also work outside of explicit functions.
Note: Experimenting with the code in this section can easily result in non-terminating ("infinite") loops! To abort them either kill the process, which loses the session data, or call break'' in another J session. When running J from the terminal you can hit ctrl-c to break.
To apply a verb a certain number of times (each time on the previous result), either dyadically apply the result of creating a monad from a dyad with &, or use the conjunction ^: as follows:
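Both variants side by side, sketched with doubling as the repeated verb:

```j
(2&*) ^: 3 ] 1       NB. double three times: 8
3 (2&*) 1            NB. the same, using the dyadic case of a monad made with &
```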
Conjunction ^: may take a verb instead of an explicit amount of repetitions, which is called with the argument/s of the construct. Note: A left argument is first bound to the repeating verb to create a monad!
Run a verb until its result stops changing (that is consecutive iterations return the same result) by using infinity as repetition count:
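Two sketches of running until the result stops changing (<. rounds down, -: halves, 2&o. is cosine):

```j
<.@-: ^: _ ] 100     NB. halve and round down until nothing changes: 0
2&o. ^: _ ] 1        NB. fixed point of cosine, approximately 0.739085
```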
Running the same loop different amounts of times is simply done by providing a list of repetition counts. This can be used to return subresults (which are not boxed!):
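For example, with doubling as the repeated verb:

```j
(2&*) ^: 0 1 2 3 ] 1     NB. results after 0, 1, 2 and 3 repetitions: 1 2 4 8
```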
Providing the repetition count as a box is a recognized pattern to capture the subresults. A box is interpreted as i.>n-repetitions (except when the boxed value is empty or infinity, which both create an infinite loop). Therefore, the construct only loops up to n-1 times and includes the original argument as first result! Consequently subresults must have the same type as the original argument.
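A minimal sketch of the boxed form, again with doubling:

```j
(2&*) ^: (<4) ] 1        NB. like repetition counts 0 1 2 3: 1 2 4 8
```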
A simple conditional either executes a branch or doesn't. In other words: a verb is executed 1 or 0 times. Replacing the number of repetitions with a boolean-expression is enough to create a conditional because J's booleans are simply the numbers 0 and 1:
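A sketch of a conditional that negates only negative numbers; the condition (0 > ]) is a train that tests whether the argument is below zero.

```j
- ^: (0 > ]) _5      NB. condition is 1, so negate runs once: 5
- ^: (0 > ]) 5       NB. condition is 0, so the argument is returned unchanged: 5
```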
Multiple branches can be expressed with a gerund and passed as left argument to conjunction @. (see: indexing gerunds), which calls the selected branch with the given argument/s. The noun index as right argument to @. may be replaced by a verb to generate it; this is also called with the argument/s to the construct:
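A three-way branch as a sketch. The name sign is hypothetical; monad * (signum) returns _1, 0 or 1, so 1 + * yields the index 0, 1 or 2, and "_ turns a noun into a constant verb.

```j
sign =: ('negative'"_)`('zero'"_)`('positive'"_) @. (1 + *)
sign _3      NB. negative
sign 0       NB. zero
sign 7       NB. positive
```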
To create while-loops an infinite loop can be combined with a conditional: once the condition fails its argument is returned unchanged, thus reused in the next iteration and again returned which then ends the loop, due to consecutive non-unique results.
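A minimal sketch of such a while-loop: double the value as long as it is below 100.

```j
2&* ^: (100 > ]) ^: _ ] 1      NB. 1 2 4 ... 64 128, then the condition fails: 128
```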
The 6 conjunctions F., F:, F.., F.:, F:. and F:: are collectively known as Fold. Usage: x? u Fold v y
Fold creates a loop which invokes v on the previous result of v and then allows to postprocess (even discarding) the result with u. However, the result of u is never the input to v.
If the first inflection (appended "." or ":") of Fold is "." only the last result is kept, while ":" combines the subresults into a final result (which pads subresults if necessary and thus the last result might differ from a Fold with a single result).
F. and F: create an infinite loop which has to be exited explicitly. They always provide x (if given) unchanged as left argument to v and use y as a whole as initial right argument.
Dyad Z:, which may be used in u and v, provides control over Fold-created loops, similar to break. and continue. in explicit loops. Its effect is determined by its left argument and only activated if the right argument is not 0:
- _3: Raise "fold limit"-error (thus no result) if the loop has already run the number of times given as right argument. Note that _1 Z: 1 in v does not increment the iteration count and thus might create an infinite loop which is not caught by _3 Z: n...
- _2: Immediately exit the loop. u never runs or completes, thus this iteration does not contribute to the overall result. Raises "domain error" if overall result is empty.
- _1: Immediately exit current function and start the next iteration with the result of v if in u or the previous result of v if in v. u never runs or completes, thus overall result stays unchanged. Note the danger of using this in v of F. or F: (see: above).
- 0: Current iteration is not aborted, but the result of u is discarded.
- 1: Exit after finishing current iteration.
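As a hedged sketch of an explicitly exited Fold: the verb next (a name made up here) first asks Z: to leave the loop once its argument has reached 5 and otherwise adds 1. With ] as u, every result of next is kept as it is:

```J
NB. J evaluates right to left, so the Z: check runs before the addition
next =: 3 : 'y + 1 [ _2 Z: y >: 5'
] F: next 0         NB. collects the results of all completed iterations
] F. next 0         NB. keeps only the result of the last completed iteration
```

Because _2 aborts the iteration in which it fires, that iteration contributes nothing to the overall result.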
The other fold variants use x, if given, as initial right argument to v and one item of y per iteration as left argument. The items of y are used in order if the second inflection is "." and in reverse if it is ":". If x is not given, the first/last item of y is used as initial right argument and the second/second-to-last item is the first left argument to v.
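A small sketch of the four item-wise variants, using + as v so the results are easy to check by hand:

```J
0 ] F.. + 1 2 3 4       NB. 10          only the last result
0 ] F:. + 1 2 3 4       NB. 1 3 6 10    all subresults, items taken front to back
0 ] F:: + 1 2 3 4       NB. 4 7 9 10    all subresults, items taken back to front
] F:. + 1 2 3 4         NB. 3 6 10      no x: the first item is the initial value
NB. u postprocesses each result without changing what v receives next
0 - F:. + 1 2 3 4       NB. _1 _3 _6 _10
```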
Conjunction :: (required leading space) allows to provide an alternative verb in case left verb fails (called with same arguments):
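For illustration, a verb that falls back to a constant when indexing fails; the name sixth is hypothetical:

```J
sixth =: 5&{ :: _1:     NB. item at index 5, or _1 if indexing fails
sixth i. 10             NB. 5
sixth 1 2 3             NB. _1   the index error is caught and _1: is called instead
```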
Conjunction ^: can be used with a gerund to preprocess the arguments:
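A sketch of the gerund form, assuming the equivalences given in the J Dictionary (stated in the comments); the verbs in the gerunds are arbitrary:

```J
NB. monad:  u^:(v1`v2) y   is   u^:(v1 y) (v2 y)
+:^:(#`{.) 5 6 7                NB. 40   double the first item as often as there are items
NB. dyad:   x u^:(v0`v1`v2) y   is   (x v0 y) u^:(x v1 y) (x v2 y)
10 +^:([`(#@])`({.@])) 1 2 3    NB. 31   add 10 to the first item, three times
```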
Many (mostly monadic) builtin verbs define an inverse function to undo their effect. The inverse of these monads can be shown by using adverb b. with argument _1.
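A few builtin inverses, queried as described; the results are displayed as verbs:

```J
+: b. _1        NB. -:   the inverse of double is halve
>: b. _1        NB. <:   the inverse of increment is decrement
^ b. _1         NB. ^.   the inverse of exponential is the natural logarithm
```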
An explicit function does not have an inverse by default, but one can be set with conjunction :. (required leading space!) and is then part of the verb definition which becomes, as a whole, the result when querying the inverse.
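A minimal sketch of assigning an inverse to an explicit verb; the name double is made up for this example:

```J
double =: 3 : '2 * y' :. (3 : 'y % 2')
double 10               NB. 20
double^:_1 ] 20         NB. 10   runs the verb given after :.
```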
For tacit functions (functions without a body and thus namespace, that do not refer to their arguments) J tries to generate an inverse automatically; it can be corrected by explicitly assigning one with conjunction :. as well. J tries to convert a simple explicit body to a (similar) tacit verb when 13 is used as function type in an explicit definition.
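Both points can be tried with a trivial verb; triple is a hypothetical name:

```J
triple =: 3&*           NB. tacit: no body, no reference to y
triple^:_1 ] 12         NB. 4      J derived the inverse on its own
13 : '2 * y'            NB. 2 * ]  the explicit body converted to a tacit verb
```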
Repeating a negative amount of times prompts J to repeat the inverse and is the intended way of explicitly calling inverses.
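For example, with builtin verbs whose inverses are known:

```J
+:^:_1 ] 10         NB. 5     halve once
+:^:_2 ] 10         NB. 2.5   halve twice
>:^:_3 ] 10         NB. 7     decrement three times
```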
Functions returning the same value no matter their argument/s are called constant. Nice tacit constant functions for the ten digits, infinity and their negative variants were already introduced. Any noun can be converted to a constant function by calling the rank operator with infinity on it!
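A short sketch of constant verbs built with the rank operator; the names five and greet are hypothetical:

```J
five =: 5"_
five 'abc'              NB. 5
1 2 five 3 4            NB. 5      also ignores a left argument
greet =: 'hello'"_
greet i. 3              NB. hello
```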
On the interactive session prompt it is easy to pipe the result of one verb into another verb: Simply put the second verb's name before the first one's. However, assigning this sequence of verbs to a name does not create a pipeline (see: names), because it creates a train (see: trains). Instead, the verbs need to be composed with a conjunction:
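The difference shows up as soon as the verbs are assigned; f and g are throwaway names:

```J
- +: 5              NB. _10   double, then negate
f =: - +:           NB. a train (here: a hook), not a pipeline
f 5                 NB. _5    computed as 5 - +: 5
g =: -@:+:          NB. composed with a conjunction
g 5                 NB. _10
```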
Each composition conjunction exists as two variants: One which passes the whole argument/s to the pipeline (aka rank infinity) and ends in ":" and one whose rank (mostly) depends on the rank of the first verb of the pipeline, essentially applying the later verbs on each result returned by the first verb, instead of the collected results.
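Boxing makes the difference between the two variants visible; a sketch with +: (rank 0) as first verb of the pipeline:

```J
<@:+: 1 2 3         NB. one box containing 2 4 6
<@+: 1 2 3          NB. three boxes: < is applied to each result of +: separately
# <@:+: 1 2 3       NB. 1
# <@+: 1 2 3        NB. 3
```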
This calls a monad on the result of another monad. Note that the usage of monadic pipelines created with & or &: is discouraged as they are not recognized as performance optimized patterns!
Calling a monad on the result of a dyad:
Call a dyad on the results of the arguments individually processed by a monad like so:
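The three pipeline shapes described above, sketched with arbitrary arithmetic verbs:

```J
-@:*: 3             NB. _9   monad on the result of a monad:  - (*: 3)
3 -@:+ 4            NB. _7   monad on the result of a dyad:   - (3 + 4)
3 +&:*: 4           NB. 25   dyad on processed arguments:     (*: 3) + (*: 4)
'abc' =&:# 'wxyz'   NB. 0    compares the lengths 3 and 4
```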
&. and &.: work like & and &: (monad on monad or dyad on result-pairs of arguments processed by monads) but also apply the inverse (see: inverse and tacit functions) of the first verb as the third verb in the pipeline!
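For illustration, squaring and logarithm both have inverses that J knows, so they can serve as the first verb:

```J
+/&.:*: 3 4         NB. 5    square, sum, then undo the squaring (square root)
3 +&.*: 4           NB. 5    the same for a dyad
3 +&.^. 4           NB. 12   ln, add, exp: multiplication via logarithms (up to rounding)
```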
If only one side needs processing, a 2-element gerund is used with an empty box marking the side not to modify; the second verb now uses its own monadic rank on the unprocessed side:
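A sketch of the one-sided form, assuming that the empty box a: in the gerund marks the argument to leave alone:

```J
1 +&.(a:`>) <5      NB. a box containing 6: only y is opened before adding
(<5) +&.(>`a:) 1    NB. a box containing 6: only x is opened before adding
```

In both cases the inverse of > (boxing) is applied to the sum at the end.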
J scripts are plaintext files, should use the extension .ijs and may have a shebang as first line (#! + the path to the J executable).
Arguments to the script are saved as boxed strings in variable ARGV_z_ in elements 3 and higher; element 2 is the path of the script (as used on the commandline) and the first element, the only one which is always available, is the name of the command which invoked the process (as used on the commandline; usually jconsole). Arguments to the J interpreter, which are not passed to the script, are not listed in ARGV. Thus a script invoked with ../j9.4/bin/jconsole -noel ./file.ijs foo 123 has ARGV -: '../j9.4/bin/jconsole';'./file.ijs';'foo';'123' while just running the interpreter from PATH interactively (jconsole) only has ARGV -: <'jconsole'.
To access environment variables use verb getenv which returns boolean 0 if it is not defined. (Note: environment variables have to be strings thus the boolean cannot be confused with a value '0'.)
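A small sketch of handling a possibly undefined variable; the variable names are only examples and home is a made-up name:

```J
getenv 'SURELY_NOT_DEFINED_12345'   NB. 0
home =: getenv 'HOME'               NB. a string, or 0 if HOME is not set
NB. pick the value (index 0) or a fallback message (index 1)
echo > (0 -: home) { home ; 'HOME is not set'
```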
Use verbs stdout and stderr to write to these file-descriptors; unlike echo, they do not append a newline.
Stdin is interpreted as input to the interactive session which starts after the interpreter finished the supplied script -- unless the script prevents starting the interactive session by terminating explicitly with exit (takes the number which shall be the process' exit/return code as argument), or consumes stdin with verb stdin, which returns it as a string. Since stdin and stdout are inverses, outputting the transformation of user input can be written conveniently using &.; for example the script
|. &. stdin ''
reverses the input and could be executed like so:
echo -n -e 'hello\nworld' | jconsole script.ijs
A string can be split into boxed lines using monadic cutopen.
Note: For this section to display properly a font with support for emoji needs to be used. If in the terminal J does not accept input of non-ASCII characters start it with the -noel parameter.
J's default string type, "literal", treats 8 bits (1 byte) as an atom. This works well for the first 128 ASCII characters, which are equivalent to their UTF-8 representation and only need a single byte to be stored. Other letters are represented by more bytes in UTF-8, which makes them hard to work with. To fix this, UCS-2 encodes every letter as 2 bytes, which can be interpreted as 1 atom by using J's type "unicode". However, since the arrival of UTF-16 (a superset of UCS-2) one cannot rely on 1 glyph being 1 "unicode" atom anymore: it may be 2. Luckily UCS-4, also called UTF-32, encodes every glyph with exactly 4 bytes, which can be treated as a single atom with J's type "unicode4". When converting to a representation with bigger atoms, be careful to use an operation which merges surrogate pairs (the multiple atoms representing a single glyph) into a single atom; if "unicode" is converted to "unicode4" atom by atom, J would still understand the result but it would not be valid UTF-32.
The verb u: provides access to unicode related functionality:
- monad u: string is 2 u: string
- monad u: number is 4 u: number
- dyad: left arg is:
  - 1: truncate to 1-byte precision
  - 2: extend or truncate to 2-byte precision
  - 3: convert to UCP (Unicode Code Point, a number)
  - 4: get UCP in UCS-2 range (<65536) as "unicode"
  - 5: like 1&u: but raise error if discarding non-zero bits
  - 6: from literal UTF-16: each 2 bytes become 1 atom of type "unicode"
  - 7: convert (literal as UTF-8) to smallest string type needed; get any UCP as "unicode"
  - 8: convert to UTF-8; get any UCP as UTF-8 encoded "literal"
  - 9: convert string (literal as UTF-8) to type "unicode4" except when all ASCII; UCPs as "unicode4" (merges surrogate pairs)
  - 10: convert each atom of string to "unicode4", therefore this does not give valid UTF-32 if surrogate pairs are present (fix with 9&u:); get any UCP as "unicode4"
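A hedged sketch of some of these conversions. To stay independent of the encoding of this file, the strings are built from byte values and code points; the names ae and smile are made up:

```J
ae =: 195 164 { a.          NB. the two UTF-8 bytes of a-umlaut (code point 228)
# ae                        NB. 2     two "literal" atoms
# 7 u: ae                   NB. 1     one "unicode" atom
3 u: 7 u: ae                NB. 228
# 8 u: 4 u: 228             NB. 2     code point -> "unicode" -> UTF-8 bytes
smile =: 8 u: 9 u: 128512   NB. an emoji outside the UCS-2 range, as UTF-8
# smile                     NB. 4     bytes
# 7 u: smile                NB. 2     a surrogate pair of "unicode" atoms
# 9 u: smile                NB. 1     a single "unicode4" atom
```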
"conjunct" = conjunction