An Introduction to XMOS’ XC language

January 2, 2014 on 1:07 am | In Technology | No Comments

This post is based on the XMOS Programming Guide and XMOS Multicore Extensions to C. I tried to summarize the contents of the 66-page document to help other people learning about XMOS processors with the StarterKit. Please note that the code snippets included are not complete examples, but only the parts that illustrate how a certain feature of XC or the XMOS hardware is declared or used. Throughout these notes, I am assuming the reader is already familiar with normal C-language syntax and programming.

Note: The code snippets are not clear. If anyone knows or can recommend a WordPress plugin that handles code better, please do!

So, here it goes:

  • The XC language:
    XC is an imperative programming language based on C. XC programs are composed of multiple tasks running in parallel. The concurrent tasks manage their own state and resources and interact by passing messages to each other. To use the extensions in a project, files containing XC code must have the .xc extension. It is possible to integrate C, C++ and XC files within the same project. The build system compiles each file based on its extension.

    Some of the differences between C/C++ and XC are:

    • case statements within a switch must be terminated with a break. ie: flow can’t cascade from one case to another (similar to C#)
    • XC supports optional (nullable) types: resource types (e.g. ports, timers) and reference types can be made nullable. This means that a variable or function parameter can either have a value or be the special value null. The ? type operator creates a nullable type. In the following example, “paramC” is a nullable reference and may be passed as “null” when calling this function:
      void myFunction(int paramA, int paramB, int &?paramC, int paramD);

      The isnull function is used to test whether a variable of nullable type is null or not. Ex:

      void f(port ?p) {
      	if (!isnull(p)) {
      		printf("Outputting to port\n");
      		p <: 0;
      	}
      }
    • Multiple Return Functions: Functions can return multiple values without the need for additional call-by-reference parameters or definition of structs to encapsulate the return values. ex:
      {int, int} swap(int a, int b) {
      	return {b, a};
      }
    • Reinterpretation: Allows wrapping/unwrapping arrays of a type into those of a larger type (array of chars to array of int). This can be useful in data transmission (such as communications using xlinks). ex:
      void transmitMsg(char msg[], int numWords) {
      	for (int i = 0; i < numWords; i++)
      		transmitInt((msg, int[])[i]);
      }
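For readers coming from plain C, the reinterpretation above can be approximated as follows. This is only a sketch of the idea, not XC: the helper names word_at and put_word are made up for illustration, and it assumes a 32-bit word, which matches int on xCORE.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read the i-th 32-bit word of a byte buffer, roughly what (msg, int[])[i]
 * expresses in XC. memcpy avoids the alignment and aliasing problems of
 * casting a char pointer to an int pointer.
 * The caller must make sure the buffer holds at least (i + 1) * 4 bytes. */
uint32_t word_at(const char *msg, size_t i)
{
    uint32_t word;
    memcpy(&word, msg + i * sizeof word, sizeof word);
    return word;
}

/* Write a 32-bit word back into a byte buffer at word index i. */
void put_word(char *msg, size_t i, uint32_t word)
{
    memcpy(msg + i * sizeof word, &word, sizeof word);
}
```

Reading each word and writing it back into a second buffer reproduces the original bytes, regardless of the byte order of the machine.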
  • Input & Output:
    The XCORE architecture provides flexible I/O ports to communicate externally. These ports have many features that enable fast I/O processing.

    • All ports must be declared as global variables.
    • Ports can be passed as function parameters.
    • Ports are declared using the port keyword.
    • An output port is declared as “out port” while an input port is declared as “in port”.
    • To output a value to a port, the <: operator is used; to input a value from a port, the :> operator is used. Ex:
      out port p = XS1_PORT_1A;
      p <: 1; // output the value 1 to port p
      p <: 0; // output the value 0 to port p


    • An input operation on a port can be made to wait for one of two conditions on the port: pins equal to a value (pinseq) or pins not equal to a value (pinsneq). These functions are used in conjunction with the when predicate to form a conditional input. Ex:
      in port oneBit = XS1_PORT_1A;
      int counter=0;
      int x;
      oneBit :> x;
      while (1) {
      	oneBit when pinsneq (x) :> x; // wait until the pin changes level
      	++counter;                    // count each change
      }
    • The (quite powerful) select statement: When tasks are run in parallel, they execute their code independently of each other. However, during this execution they may need to react to external events from other tasks or the system environment. Tasks can react to events using the select construct, which pauses the task and waits for an event to occur. A select can wait for several events and handles the event that occurs first.
      The syntax of the select statement is similar to that of the switch statement:

      in port p1 = XS1_PORT_1A;
      in port p2 = XS1_PORT_1B;
      select {
      	case p1 when pinseq (0x1) :> int x:
      		// handle the event here
      		break;
      	case p2 when pinseq (0x1) :> int x:
      		// handle the event here
      		break;
      }

      This statement will pause until either of the events occurs and then execute the code within the relevant case. Although the select waits on several events, only one of them is handled each time the statement runs. If both inputs occur at the same time, only one will be selected. The other remains ready and is handled the next time the select executes, for example on the next iteration of an enclosing while loop.

      case statements are not permitted to contain output operations as the XMOS architecture requires an output operation to complete but allows an input operation to wait until it sees a matching output before committing to its completion.

      Each port, timer, or other resource may appear in only one case in a select statement. The XMOS architecture restricts each resource to waiting for just one condition at a time.

    • Chapter 2 of XMOS Programming Guide concludes with an example on how to implement a UART using a single thread, and further demonstrates the power of the select statement (pages 26-29)
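The pinsneq loop shown earlier in this section can be hard to picture without hardware. Below is a rough plain C model of it, run over a recorded array of pin samples instead of a live port. The function name count_level_changes is hypothetical, and the model ignores timing entirely.

```c
#include <stddef.h>

/* Count how many times the sampled level changes, mirroring the XC loop
 *     oneBit :> x;
 *     while (1) { oneBit when pinsneq(x) :> x; ++counter; }
 * samples[0] plays the role of the initial unconditional input. */
size_t count_level_changes(const int *samples, size_t n)
{
    size_t changes = 0;
    if (n == 0)
        return 0;
    int last = samples[0];
    for (size_t k = 1; k < n; k++) {
        if (samples[k] != last) {   /* the pinsneq(last) condition is met */
            last = samples[k];      /* ":> x" stores the new level */
            changes++;
        }
    }
    return changes;
}
```

On the real hardware the thread is paused while the condition is false, so no polling loop runs. The loop here exists only because the model works on stored samples.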
  • Concurrency: The bread and butter of what makes xCORE so different. xC programs consist of tasks that run in parallel. Tasks are just code, so you can define them as normal C functions.

    Tasks can be run in parallel from any function. However, it is only in the function main that tasks can be set up to run on multiple different xCORE cores.

    • Tasks (threads) running in parallel are declared and exist within the scope of a “par” statement. Ex:
      par {
      	task1();
      	task2();
      }

      This statement will run task1 and task2 in parallel to completion. It will wait for both tasks to complete before carrying on. Tasks on separate logical cores run in parallel in hardware, so there is no notion of priority or scheduling between the tasks.

    • Thread disjointness rules: There are two simple rules for data access across threads:
      • 1) A variable can be shared between threads only if every thread just reads it. Once one thread modifies the variable, no other thread can access it.
      • 2) Only one thread can have access to, or use, any single port.
      • There are some examples that show the thread disjointness rules in action (pages 32-33). Those not familiar with the concepts of threading and thread safety should read those two pages to get a better understanding of the rules.
      • Ports can be declared on the tile which will use them, if the device has more than one tile. Ex:
        on tile[0] : out port tx = XS1_PORT_1A ;
        on tile[0] : in port rx = XS1_PORT_1B ;
        on tile[1] : out port lcdData = XS1_PORT_32A ;
        on tile[1] : in port keys = XS1_PORT_8B ;
    • There are various ways to communicate data between threads:
      • Channel Communication: Channels provide a primitive method of communication between tasks. They connect tasks together and provide blocking communication but do not define any types of message. The rules of channels are:
        • Channels are synchronous. Outputting data on a channel end will cause the sending thread to block until the receiving thread has consumed that data.
        • Channels are declared with the keyword chan.
        • Each channel has two endpoints.
        • Channel endpoints are declared using the keyword chanend (e.g. when passed as parameters to functions running on separate threads).
        • Channels are lossless. So, each channel output must have a matching input.
        • The amount of data going out must be equal to that coming in.
        • Only one thread can use a given channel end (in or out). Hence, only two threads can use a given channel.
        • The output and input operators <: and :> are used to send and receive messages respectively. The operators send a value over the channel.
        • Ex:
          void task1(chanend c) {
          	c <: 5;
          }
          void task2(chanend c) {
          	select {
          		case c :> int i:
          			break;
          	}
          }
          int main(void) {
          	chan c;
          	par {
          		task1(c);
          		task2(c);
          	}
          	return 0;
          }
      • Transactions: Provides a means for channels to synchronize on the beginning and end of a transaction, while running asynchronously within the transaction. The main things to know about transactions are:
        • Transactions exist within a par statement.
        • A transaction consists of a “master” thread and a “slave” thread, running concurrently.
        • Each of the “master” and “slave” is a statement. It can be a block of code surrounded by curly braces (a bit like anonymous methods in javascript), or can be a function call.
        • Each transaction can communicate over one channel only.
        • The master thread blocks only if the channel buffer is full, while the slave thread blocks if there is no data to consume.
        • Ex:
          int send[10], receive[10];
          chan c;
          par {
          	master {
          		for (int i = 0; i < 10; i++)
          			c <: send[i];
          	}
          	slave {
          		for (int i = 0; i < 10; i++)
          			c :> receive[i];
          	}
          }
      • Streams: Streams are asynchronous, permanent, channels between two threads. Main points about streams:
        • They are declared as “streaming chan VarName”
        • They provide the highest data rates between threads.
        • Outputs to and inputs from streams take one instruction to complete as long as the channel buffer is not full.
        • Unlike transactions, streams can be processed concurrently, creating multi-threaded pipelines.
        • Ex:
          port dataIn = XS1_PORT_8A;
          port dataOut = XS1_PORT_8B;
          streaming chan s1, s2;
          par {
          	receiveData(dataIn, s1);
          	processData(s1, s2);
          	sendData(dataOut, s2);
          }
      • Parallel Replication: a variation on the par statement that runs the same function or block of code across several threads. It uses a format similar to the for loop in C, but with the keyword “par” in place of “for”. Ex:
        chan c[4];
        int someData[4];
        par (int i = 0; i < 4; i++) {
        	runMyFunction(c[i], c[(i+1)%4], someData[i]);
        }
      • Services: Provide a means to communicate to external devices that implement xLinks, ex: FPGAs.
      • Interfaces: For people who are familiar with web services and data contracts, xC supports communication over predefined interfaces. XMOS Multicore Extensions to C Chapter 3 (pages 10-15) details the use of interfaces.
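The parallel replication example in this section connects its tasks in a ring through c[i] and c[(i+1)%4]. The plain C sketch below checks that this indexing respects the rule that only two threads use a given channel. The helper name count_channel_users is made up for illustration, and the code is ordinary C arithmetic, not XC.

```c
#include <stddef.h>

/* For the replicated call runMyFunction(c[i], c[(i+1)%n], ...), count how
 * many task ends touch each channel. users[] must have room for n entries. */
void count_channel_users(size_t n, unsigned users[])
{
    for (size_t ch = 0; ch < n; ch++)
        users[ch] = 0;
    for (size_t i = 0; i < n; i++) {
        users[i]++;             /* task i uses c[i] */
        users[(i + 1) % n]++;   /* ... and the next channel in the ring */
    }
}
```

For n = 4 every entry comes out as 2, so each channel has exactly two ends in use, one per neighbouring task.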
  • Clocks:
    • Clocks must be declared as global variables, and each initialized with a unique clock resource identifier.
    • To configure a clock rate call configure_clock_rate(clock clk, unsigned a, unsigned b) where “clk” is the clock to be configured, “a” is the dividend of the desired rate, and “b” is the divisor of the desired rate. The hardware supports rates of “ref” MHz and rates of the form (ref/2n) MHz where ref is the reference clock frequency and “n” is a number in the range 1 to 255 inclusive (copied from xs1.h).
    • To configure an output port as a clock port call configure_port_clock_output(void port p, const clock clk) where “p” is a 1-bit port, and “clk” is the clock block to output. If the port is not a 1-bit port, an exception is raised (from xs1.h).
    • To tie an output port to a clock port call configure_out_port(void port p, const clock clk, unsigned initial) where “p” is the output port, “clk” is the clock block, and “initial” is the initial value to output on the port. The port drives the initial value on its pins until an input or output statement changes the value driven.
    • Outputs are driven on the next falling edge of the clock and every port-width bits of data are held for one clock cycle. If the port is unbuffered, the direction of the port can be changed by performing an input. This change occurs on the falling edge of the clock after any pending outputs have been held for one clock period. Afterwards, the port behaves as an input port with no ready signals (from xs1.h).
    • To start a clock call start_clock(clock clk) where “clk” is the clock block to be put into running mode.
    • The ability to tie a clock port to an input/output port greatly simplifies the code for data input/output. Ex:
      clock clk = XS1_CLKBLK_1;
      out port clkPort = XS1_PORT_1E;
      out port outPort = XS1_PORT_8A;
      configure_clock_rate(clk, 100, 1); //Drive the port at 100MHz?!
      configure_port_clock_output(clkPort, clk); // clkPort is 1-bit, so it can output the clock
      configure_out_port(outPort, clk, 0); // outPort is clocked by clk
      start_clock(clk);
      for (int i = 0; i < 1000; i++)
      	outPort <: i;
    • Using an external clock source to drive a synchronous input port is very similar (page 45 of XMOS Programming Guide).
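The rate rule quoted for configure_clock_rate (either ref MHz, or ref/2n MHz with n from 1 to 255) is easy to get wrong by hand. Here is a small plain C sketch that checks whether a requested a/b rate fits that rule. The function name clock_divider and its return convention are my own invention and are not part of xs1.h.

```c
/* Returns 0 if a/b MHz equals the reference rate, n (1..255) if it equals
 * ref/(2n) MHz, and -1 if the rate has neither form. */
int clock_divider(unsigned ref_mhz, unsigned a, unsigned b)
{
    if (a == 0 || b == 0)
        return -1;
    unsigned long long num = (unsigned long long)ref_mhz * b;
    if (num == a)                       /* a/b == ref */
        return 0;
    /* a/b == ref/(2n)  <=>  2*n*a == ref*b */
    unsigned long long den = 2ULL * a;
    if (num % den != 0)
        return -1;
    unsigned long long n = num / den;
    return (n >= 1 && n <= 255) ? (int)n : -1;
}
```

With a 100 MHz reference, a request for 100/8 (12.5 MHz) gives n = 4, while 100/3 does not match either form and returns -1.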
  • Timers: A timer is an xCORE resource with a 32-bit counter that is continually incremented at a rate of 100MHz.
    • Timers are declared using the keyword timer.
    • Timers may be declared as local variables.
    • An input statement (:>) can be used to read the value of a timer’s counter.
    • Timers can be used to periodically perform an action using the when statement. Ex:
      timer t;
      unsigned int time;
      t :> time; // read the current counter value as the starting point
      for (int i = 0; i < 100; i++) {
      	t when timerafter(time) :> void;
      	time += XS1_TIMER_MHZ * 1000 * 1000;
      	printstr("This message is printed once a second \n");
      }

      Note in the above examples the time input from the timer is discarded in the loop by inputting it to void. Because the processor completes the input shortly after the time specified is reached, had we input the time to a variable, the input in the loop may actually increment the value of time by a small amount. This amount may be compounded over multiple loop iterations, leading to drift over time.

    • It is similarly possible to perform a periodic action inside a select statement. Ex:
      timer t;
      unsigned int time;
      t :> time; // read the current counter value as the starting point
      for (int i = 0; i < 100; i++) {
      	select {
      		case t when timerafter(time) :> void :
      			time += XS1_TIMER_MHZ * 1000 * 1000;
      			printstr("This message is printed once a second \n");
      			break;
      		// Insert cases to handle other events here ...
      	}
      }
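The drift note in this section can be checked with simple arithmetic. The plain C sketch below compares the two ways of computing the next deadline, assuming every timer input completes a fixed number of ticks late. The function name final_deadline and the fixed latency are assumptions made for illustration only.

```c
#include <stdint.h>

/* Simulate a periodic loop and return the deadline after the last iteration.
 * reuse_input == 0: next deadline = previous deadline + period (input to void).
 * reuse_input != 0: next deadline = value read from the timer + period. */
uint32_t final_deadline(uint32_t start, uint32_t period, uint32_t latency,
                        unsigned iterations, int reuse_input)
{
    uint32_t deadline = start;
    for (unsigned k = 0; k < iterations; k++) {
        uint32_t woke_at = deadline + latency;  /* value the timer input would return */
        deadline = (reuse_input ? woke_at : deadline) + period;
    }
    return deadline;
}
```

With a period of 100000000 ticks (one second at 100MHz) and a latency of 5 ticks, ten iterations stay exactly on schedule when the input is discarded, but end up 50 ticks late when the input value is reused. The arithmetic is unsigned 32-bit on purpose, because the hardware counter is 32 bits wide and wraps around roughly every 43 seconds at 100MHz.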
  • Port Buffering: A buffer can hold data output by the processor until the next falling edge of the port’s clock, allowing the processor to execute other instructions during this time. It can also store data sampled by a port until the processor is ready to input it. Using buffers, a single thread can perform I/O on multiple ports in parallel. This decouples the sampling and driving of data on ports from a computation.
    • Buffered ports are declared by adding the keyword buffered in the port declaration:
      in buffered port:8 inP = XS1_PORT_8A;
      out buffered port:8 outP = XS1_PORT_8B;
    • A :x defines the width of the buffer for a port, in this case 8-bit.
    • Tying a clock to a buffered port causes the port’s buffer to hold input or output values until the next clock cycle, which simplifies the input/output code.
    • By tying more than one port to the same clock, it is possible to drive those ports synchronously to that clock, making them behave like one larger port.
    • Ex:
      out buffered port:4 p = XS1_PORT_4A;
      out buffered port:4 q = XS1_PORT_4B;
      out port clkPort = XS1_PORT_1E;
      clock clk = XS1_CLKBLK_1;
      configure_clock_rate(clk, 100, 8);
      configure_port_clock_output(clkPort, clk);
      configure_out_port(p, clk, 0);
      configure_out_port(q, clk, 0);
      start_clock(clk);
      p <: 0; // start an output
      sync(p); // synchronise to falling edge
      for (char c = 'A'; c <= 'Z'; c++) {
      	p <: (c & 0xF0) >> 4;
      	q <: (c & 0x0F);
      }
    • The “sync()” function makes the processor wait until the next falling edge on which the last data in the buffer has been driven for a full period. The next instruction is therefore executed just after a falling edge, which leaves the maximum amount of time before the following falling edge.
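The port buffering example in this section splits each character across two 4-bit ports. The masking and shifting it uses can be tried out in plain C as sketched below. The helper names are hypothetical, and join_nibbles is added only to show that the two halves carry the whole byte.

```c
#include <stdint.h>

/* Upper four bits of c, as sent to port p in the example: (c & 0xF0) >> 4. */
uint8_t high_nibble(uint8_t c)
{
    return (uint8_t)((c & 0xF0) >> 4);
}

/* Lower four bits of c, as sent to port q in the example: c & 0x0F. */
uint8_t low_nibble(uint8_t c)
{
    return (uint8_t)(c & 0x0F);
}

/* What a receiver sampling both ports on the same clock edge would rebuild. */
uint8_t join_nibbles(uint8_t hi, uint8_t lo)
{
    return (uint8_t)(((hi & 0x0F) << 4) | (lo & 0x0F));
}
```

For 'A' (0x41) the two ports carry 4 and 1. Because both ports are driven from the same clock, the halves appear in the same clock period, which is what makes the pair behave like one 8-bit port.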
  • Serialisation and strobing:
    • Strobing is just another way of saying clocking, as in, a synchronous port like SPI.
    • A port can be configured to perform serialisation in hardware, useful if data must be communicated over ports that are only a few bits wide, and strobing, useful if data is accompanied by a separate data valid signal. Offloading these tasks to the ports frees up more processor time for executing computations.
    • A simple 8-bit SPI output-only port can be as short as:
      out buffered port:8 outP = XS1_PORT_1A;
      out port clkPort = XS1_PORT_1E;
      clock clk = XS1_CLKBLK_1;
      configure_clock_rate(clk, 100, 8);
      configure_port_clock_output(clkPort, clk);
      configure_out_port(outP, clk, 0);
      start_clock(clk);
      int x = 0xAA;
      outP <: x;
    • Defining the buffer width of the port as 8 bits, while the port itself is 1 bit wide, causes the port to use a shift register to serialize 8 bits of data at a time.
    • The same code can be used to deserialize data by declaring the buffered port as an in port and performing an input (:>) operation instead.
    • The XMOS Programming Guide provides a case study on how to implement a 100 Mbit Ethernet Media Independent Interface (MII) (pages 58-65).
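To picture what the shift register does with the 0xAA written in the SPI example, here is a plain C model of 8-bit serialisation onto a 1-bit port. It assumes the least significant bit is driven first, which is my understanding of how xCORE ports shift data out. The function names are hypothetical.

```c
#include <stdint.h>

/* Model of an 8-bit buffered output on a 1-bit port: bits[k] is the level
 * driven in clock period k. Assumes least-significant-bit-first order. */
void serialize_lsb_first(uint8_t value, int bits[8])
{
    for (int k = 0; k < 8; k++) {
        bits[k] = value & 1;
        value >>= 1;
    }
}

/* Model of the matching 8-bit buffered input: rebuild the byte from the
 * eight sampled levels, again with the first sample as the lowest bit. */
uint8_t deserialize_lsb_first(const int bits[8])
{
    uint8_t value = 0;
    for (int k = 7; k >= 0; k--)
        value = (uint8_t)((value << 1) | (bits[k] & 1));
    return value;
}
```

Under that assumption, 0xAA leaves the pin as 0, 1, 0, 1, 0, 1, 0, 1. A device that expects the most significant bit first would need the byte bit-reversed before the output.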
