Ethernet IPv4 UDP formatter
===========================

This module provides a VHDL-based Ethernet IPv4 UDP packet formatter for data transmission, interfacing with a FIFO input for data and a PCS interface for output. It constructs and transmits Ethernet frames with embedded IPv4, UDP and SDN headers, calculates checksums, and manages packet transmission states.

Generics
--------

.. list-table::
   :widths: 20 40 30 20
   :header-rows: 1

   * - Name
     - Description
     - Type
     - Default
   * - udp_payload_size_g
     - Size of the UDP payload in bytes, without UDP and SDN headers (maximum of 8920 -> 890 HotRIO packets)
     - ``natural``
     - 1424
   * - sdn_topic_uid_g
     - SDN topic unique identifier
     - ``array_slv8(3 downto 0)``
     - (x"00", x"00", x"00", x"00")
   * - sdn_topic_vers_g
     - SDN topic version
     - ``array_slv8(3 downto 0)``
     - (x"00", x"00", x"00", x"00")

Ports
-----

.. list-table::
   :widths: 20 40 30
   :header-rows: 1

   * - Name
     - Description
     - Type
   * - clk_i
     - TX clock of the PCS
     - ``std_logic``
   * - rstn_i
     - Active-low, synchronous reset for the entire module
     - ``std_logic``
   * - data_i
     - Input data vector for UDP payload bytes
     - ``std_logic_vector(7 downto 0)``
   * - wren_i
     - Internal FIFO write enable, set high to include ``data_i`` in the UDP payload
     - ``std_logic``
   * - mac_source_i
     - Input source MAC address for Ethernet frame
     - ``array_slv8(5 downto 0)``
   * - mac_dest_i
     - Input destination MAC address for Ethernet frame
     - ``array_slv8(5 downto 0)``
   * - ip_source_i
     - Input source IPv4 address for the packet
     - ``array_slv8(3 downto 0)``
   * - ip_dest_i
     - Input destination IPv4 address for the packet
     - ``array_slv8(3 downto 0)``
   * - udp_source_i
     - Input source UDP port number
     - ``array_slv8(1 downto 0)``
   * - udp_dest_i
     - Input destination UDP port number
     - ``array_slv8(1 downto 0)``
   * - sdn_send_time_i
     - SDN header timestamp, to be provided by an external module if needed
     - ``array_slv8(7 downto 0)``
   * - data_o
     - Output data vector for PCS interface
     - ``std_logic_vector(7 downto 0)``
   * - k_flag_o
     - Output K flag signal for PCS interface
     - ``std_logic``
   * - correct_disp_o
     - Control signal for the PCS to select the right disparity during an IPG
     - ``std_logic``
   * - packet_sent_o
     - Pulses for one clock cycle when a packet transmission has ended (before the IPG)
     - ``std_logic``
   * - err_fifo_full_o
     - Pulses high when a write is attempted while the internal FIFO is full
     - ``std_logic``
   * - cnt_packets_sent_o
     - Output counter for packets successfully sent
     - ``std_logic_vector(CONFIG_DATA_WIDTH - 1 downto 0)``
   * - cnt_err_fifo_full_o
     - Output counter for dropped payload bytes due to full internal FIFO
     - ``std_logic_vector(CONFIG_DATA_WIDTH - 1 downto 0)``

.. image:: Resources/udp_formatter_architecture.png

Input Interface
---------------

This module assembles UDP payload bytes with MAC, IPv4, UDP and SDN header fields to produce valid, fixed-length 802.3z frames, each containing a single non-fragmented UDP packet. An interpacket gap is generated between frames.

The payload is provided by the user one byte at a time by setting ``data_i`` to the desired value while ``wren_i`` is held high. Bytes are appended to the payload and sent to the PCS interface in FIFO order.

To prevent data loss, the user shall ensure that the duty cycle of ``wren_i`` is less than (``udp_payload_size_g``)/(``udp_payload_size_g`` + 117) over the time interval of each packet transmission. The following table shows three examples:

.. list-table::
   :header-rows: 1

   * - Payload size (bytes)
     - Maximum ``wren_i`` duty cycle
     - Notes
   * - 10
     - 7.8%
     - Smallest supported payload
   * - 1424
     - 92.4%
     - Largest non-fragmented packet with MTU 1500
   * - 8924
     - 98.7%
     - Largest supported payload

The module monitors the level of its internal buffer, ensuring that it only begins transmission when the buffer contains enough data to complete a full packet, and raising an error signal on ``err_fifo_full_o`` if it runs out of memory.
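
As a quick cross-check of the duty-cycle bound above, the following Python sketch evaluates the same expression in software. It is not part of the VHDL sources: the function names are hypothetical, and rounding the percentage down to one decimal is an assumption chosen so that the printed limit stays on the conservative side.

.. code-block:: python

   import math

   FRAME_OVERHEAD = 117  # non-payload byte slots per packet, taken from the bound above


   def max_wren_duty_cycle(udp_payload_size: int, overhead: int = FRAME_OVERHEAD) -> float:
       """Upper bound on the fraction of clock cycles in which wren_i may be high."""
       if udp_payload_size <= 0:
           raise ValueError("udp_payload_size must be positive")
       return udp_payload_size / (udp_payload_size + overhead)


   def as_percent_rounded_down(ratio: float) -> float:
       """Percentage with one decimal, rounded down (assumption: conservative limit)."""
       return math.floor(ratio * 1000) / 10


   for size in (10, 1424, 8924):
       print(size, as_percent_rounded_down(max_wren_duty_cycle(size)))

With the three payload sizes from the table, the sketch reproduces the listed percentages.
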
Buffer depth is automatically determined so that it always exceeds the sum of payload size and total header size, allowing the user to keep sending bytes continuously until the first payload is provided to the PCS.

Header fields are saved in a shift register near the start of packet transmission, when the start of frame delimiter (SFD) is sent. Any changes to the header input ports must therefore be applied no more than 12 clock cycles after the ``packet_sent_o`` pulse.

The ``sdn_send_time_i`` timestamp should be calculated by a dedicated module, taking into account initial absolute time information provided from outside the FPGA. If this port is not assigned, it will default to zero.

MAC addresses, IPv4 addresses and UDP ports shall be provided in big endian format, while SDN header fields (topic UID, topic version and timestamp) shall be provided in little endian format.

PCS Interface
-------------

This module is compatible with the ``ecp5_dual_wrapper_8b`` PCS interface, which consists of three main signals: an 8-bit data line, a k-character flag, and a disparity correction flag. The disparity correction flag pulses for one clock cycle upon entering the interpacket gap, enabling the PCS to reset disparity and begin the next frame with negative disparity.

Implementation Details
----------------------

Packet Description
~~~~~~~~~~~~~~~~~~

The main packet structure is the following:

- Start of Packet (SOP)

  - A single **0xFB** k-character

- Preamble

  - Six 0x55 data bytes

- Start of frame delimiter

  - A single 0xD5 data byte

- MAC header (14 bytes)
- IPv4 header (20 bytes)
- UDP header (8 bytes)
- SDN header (48 bytes)
- Data
- Frame check sequence, or FCS

  - 4 bytes of CRC32 of the Ethernet packet content, from MAC header to the end of data

- Interpacket gap (12 bytes minimum)

  - Made up of alternating **0xBC** special characters and 0x50 data characters

MAC header
~~~~~~~~~~

The MAC header consists of the following fields:

- MAC destination, 6 bytes
- MAC source, 6 bytes
- Ether-type (set to 0x0800 for the IPv4 protocol), 2 bytes

IPv4 header
~~~~~~~~~~~

The IPv4 header structure is shown in the table below, 32 bits per row.

+---------+-------+---------------+---------------+-----------------+
| Version | IHL   | Type          | Total Length                    |
+---------+-------+---------------+---------------+-----------------+
| Identification                  | Flags         | Fragment Offset |
+---------+-------+---------------+---------------+-----------------+
| Time to Live    | Protocol      | Header Checksum                 |
+---------+-------+---------------+---------------+-----------------+
| Source IP Address                                                 |
+---------+-------+---------------+---------------+-----------------+
| Destination IP Address                                            |
+---------+-------+---------------+---------------+-----------------+

Table reference: RFC 791

- Version (4 bits), set to 0x4 for IPv4
- Internet Header Length (4 bits), set to 0x5 for a 20-byte header (the minimum)
- Type of service (1 byte), set to 0x0 for normal service
- Total length (2 bytes), calculated during synthesis with the following formula: ``ipv4_length = 20 + 8 + 48 + udp_payload_size_g``
- Identification (2 bytes), a big endian 16-bit counter incremented at each packet sent
- Flags (3 bits), set to 0b010 for "Don't Fragment" (the first byte of the Flags/Fragment Offset word is therefore 0x40)
- Fragment offset (13 bits), not used, set to 0x000
- Time to live (1 byte), set to 0xFF
- Protocol (1 byte), set to 0x11 for the UDP protocol
- Header checksum (2 bytes), calculated dynamically as explained in the "IPv4 checksum" section
- Source IP Address (4 bytes), taken from the input port
- Destination IP Address (4 bytes), taken from the input port

UDP header
~~~~~~~~~~

+--------+--------+--------+--------+
| Source Port     | Dest Port       |
+--------+--------+--------+--------+
| Length          | Checksum        |
+--------+--------+--------+--------+

Table reference: RFC 768

The UDP header is made up of four 16-bit fields:

- Source port, taken from the input port
- Destination port, taken from the input port
- Length, calculated during synthesis with the following formula: ``udp_length = 8 + 48 + udp_payload_size_g`` (the SDN header is carried as part of the UDP data, so that ``ipv4_length = 20 + udp_length``)
- Checksum, unused, set to 0x0000

SDN Header
~~~~~~~~~~

This module includes an SDN header inside the UDP packet. The SDN header is made up of the following fields:

- Header unique identifier (4 bytes)
- Header version (4 bytes)
- Header size (4 bytes)
- Topic unique identifier (4 bytes)
- Topic version (4 bytes)
- Topic size (4 bytes)
- Topic counter (8 bytes)
- Packet send time (8 bytes)
- Packet receive time (8 bytes)

Header UID, version and size are hardcoded to the values expected by the SDNv2.x core library. The topic UID and version can be set by the user at compile time through generics. Topic size and topic counter are generated inside this module: the former is derived from the requested payload length, and the latter is incremented after each transmission operation. Finally, the packet receive timestamp is hardcoded to zero, as this module acts as a source of SDN data and does not receive any packets.

Main state machine
~~~~~~~~~~~~~~~~~~

The main state machine is made up of 10 states. The states are selected one after the other in sequence.

- IPG

  During this state, the module sends an IDLE2 pattern to the PCS (**0xBC** 0x50), correcting running disparity by inserting an IDLE1 code group (**0xBC** 0xC5) if needed to guarantee that subsequent code groups are aligned on negative disparity boundaries. The start of transmission of the Ethernet frame is triggered when three conditions are fulfilled:

  - The amount of data in the internal FIFO buffer is enough for an entire transmission, as set by ``udp_payload_size_g``.
  - At least 12 bytes of interpacket gap pattern have been sent.
  - A complete IDLE2 code group has just been sent.

- PREAMBLE

  In this state, the preamble of the Ethernet frame is sent: **0xFB** 0x55 0x55 0x55 0x55 0x55 0x55 0xD5. Simultaneously, the module computes the checksum of the IPv4 header based on the IP addresses provided by the user at the beginning of the state. Note that header fields are not yet registered in this state; this may result in checksum errors if the IP addresses are changed while the calculation is in progress. To avoid these errors, ensure that input IP addresses are updated no more than 12 cycles after the assertion of ``packet_sent_o``. MAC, UDP and SDN fields may be updated later (up to 20 cycles after ``packet_sent_o``), but it is advisable to change all header fields at the same time to prevent the potential transmission of inconsistent headers.

- HEADER

  At the beginning of this state, all user-defined header fields are saved into a shift register, which then provides them one byte at a time at the PCS interface output. The internal FIFO read enable is asserted two cycles before the end of this process to account for Lattice ECP5 block memory latency, so that payload data is available at the beginning of the following state.

- PAYLOAD

  Payload bytes are fetched from the internal buffer and routed to the PCS interface output. Two cycles before the end of the process, the read enable is deasserted to compensate for its early assertion in the previous state.

- FINISH_CRC

  During the ``header`` and ``payload`` states, data has been continuously intercepted by a CRC32 module, updating the value of the Ethernet Frame Check Sequence while the packet is being transmitted. This one-cycle state gives the CRC32 module time to finish its calculation before the FCS is sent. No data is appended to the packet in this state, but transmission continues without any gaps, because payload bytes travel to the module output through an intermediate register, while the frame check sequence, in the next state, is sent directly to the output.

- FCS

  As mentioned in the previous state, here the Frame Check Sequence is sent directly to the PCS interface output, bypassing the intermediate register to recover the delay needed to finish CRC calculation after extracting the last payload byte from the internal buffer. The first character of the End of Packet Delimiter sequence, **0xFD**, is also preloaded into the intermediate register, restoring normal data flow as soon as the state ends.

- EPD

  In this one-cycle state, the carrier extend special character **0xF7** is appended to the packet content, completing the alignment-independent part of the End of Packet Delimiter sequence started by the preloaded value from the previous state. The following state may be ``realign`` or ``ipg`` depending on the parity of the payload size. When this state is reached, transmission of meaningful packet data has finished; thus, ``packet_sent_o`` pulses high and the packet counter, IPv4 identification number and SDN topic counter are incremented.

- REALIGN

  If the payload size is odd, the 802.3z frame as generated above would end on an odd-aligned byte boundary. Since this would result in a PCS synchronization error on the receiver side, an additional carrier extend special character is appended to the data stream, so that the interpacket gap is guaranteed to begin on an even-aligned byte boundary. The state machine then restarts from the ``ipg`` state.
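
The state sequence above can be summarised as a per-state byte budget. The following Python sketch is a software model only, not taken from the VHDL sources. It assumes one byte is emitted per clock cycle, that the interpacket gap is exactly its 12-byte minimum, and that the End of Packet Delimiter contributes two bytes (0xFD and 0xF7); the function and dictionary names are hypothetical.

.. code-block:: python

   # Header sizes in bytes, as listed in the Packet Description section.
   HEADER_BYTES = {"mac": 14, "ipv4": 20, "udp": 8, "sdn": 48}


   def frame_byte_budget(udp_payload_size: int, ipg_bytes: int = 12) -> dict:
       """Bytes sent to the PCS for one frame, grouped by the state that emits them."""
       if ipg_bytes < 12 or ipg_bytes % 2:
           raise ValueError("assumed IPG: an even number of bytes, at least 12")
       return {
           "PREAMBLE": 1 + 6 + 1,                   # 0xFB, six 0x55, 0xD5
           "HEADER": sum(HEADER_BYTES.values()),    # MAC + IPv4 + UDP + SDN
           "PAYLOAD": udp_payload_size,
           "FINISH_CRC": 0,                         # no byte is appended here
           "FCS": 4,                                # CRC32 of headers and payload
           "EPD": 2,                                # 0xFD (preloaded) and 0xF7
           "REALIGN": udp_payload_size % 2,         # extra 0xF7 for odd payloads
           "IPG": ipg_bytes,
       }


   def overhead_bytes(udp_payload_size: int) -> int:
       """Non-payload bytes per frame under the assumptions stated above."""
       budget = frame_byte_budget(udp_payload_size)
       return sum(budget.values()) - budget["PAYLOAD"]


   print(overhead_bytes(1424), overhead_bytes(1425))

Under these assumptions the overhead is 116 bytes for even payload sizes and 117 bytes for odd ones, which is in line with the constant 117 used in the ``wren_i`` duty-cycle bound of the Input Interface section.
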

IPv4 checksum
~~~~~~~~~~~~~

The IPv4 header checksum is computed for each packet while the main state machine is sending the frame preamble.

The checksum calculation is performed by adding up the IPv4 header fields in 16-bit chunks. In one's-complement addition, any carry out of the 16-bit sum is added back to the result; the checksum state machine defers this to the last step. It uses an 18-bit adder for all the intermediate steps, then produces a 16-bit checksum by removing the two most significant bits of the sum and adding them to the remaining 16 bits.

The calculation process takes 8 cycles, including state machine latency. At the end of the ``preamble`` state, the updated checksum is inserted in the IPv4 header, to be sent in the following state.

The checksum state machine contains the following states:

- IDLE

  The state machine waits for a start signal to begin calculation. The signal is internally provided by the main state machine at the end of the interpacket gap state.

- SUM

  In this state, one 16-bit chunk of the 160-bit IPv4 header is added to an 18-bit accumulator register at each cycle. The register is initially set to the sum of all constant fields of the header, so that calculation is only needed for the fields that can change: source and destination IP addresses and identification number.

- CARRY_ADD

  In this state, the final result is calculated by adding the excess bits to the accumulator and inverting the result. The checksum is registered and made available to the main process, while the state returns to ``idle`` until a new packet transmission begins.

Ethernet CRC32
~~~~~~~~~~~~~~

The Ethernet frame is closed with the four bytes of the Frame Check Sequence. This value is a CRC32 calculated over the entire frame, including the payload and all the headers, but not the preamble.

The VHDL code is based on an automatically generated CRC table, which combinationally provides an updated CRC given the previous CRC value and the current byte to be sent.
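
The byte-wise update described above can be modelled in software to cross-check FCS values captured in simulation. The sketch below is a Python reference model, not the generated VHDL table: it assumes the standard reflected CRC-32 formulation (polynomial constant 0xEDB88320, initial value 0xFFFFFFFF, final inversion, least significant byte transmitted first), and all function names are hypothetical. The result is compared with ``zlib.crc32`` from the Python standard library.

.. code-block:: python

   import zlib


   def make_crc_table(poly: int = 0xEDB88320) -> list:
       """Build the 256-entry lookup table for the reflected CRC-32."""
       table = []
       for byte in range(256):
           crc = byte
           for _ in range(8):
               crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
           table.append(crc)
       return table


   CRC_TABLE = make_crc_table()


   def crc32_update(crc: int, byte: int) -> int:
       """New CRC register value from the previous value and one data byte."""
       return (crc >> 8) ^ CRC_TABLE[(crc ^ byte) & 0xFF]


   def ethernet_fcs(frame: bytes) -> bytes:
       """FCS bytes in transmission order for a frame (MAC header to end of data)."""
       crc = 0xFFFFFFFF                      # initial register value
       for byte in frame:
           crc = crc32_update(crc, byte)
       return (crc ^ 0xFFFFFFFF).to_bytes(4, "little")


   data = b"123456789"
   assert ethernet_fcs(data) == zlib.crc32(data).to_bytes(4, "little")

The table here is indexed in reflected bit order, so the byte swapping performed by the hardware may appear at a different point in this model; only the final four bytes are meant to be comparable.
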
At the end of the frame, the FCS is inverted and its endianness is swapped, according to the Gigabit Ethernet specification.

The CRC32 state machine contains the following states:

- IDLE

  The state machine waits for a start signal to begin calculation. This is internally provided by the main state machine at the end of the preamble state. The initial CRC value is set to 0xFFFFFFFF, which gives the same outcome as an initial value of 0 with the first four bytes inverted (which is how the specification states the requirement).

- COMPUTE

  In this state, the lookup table is used to update the content of the CRC register depending on the byte currently being sent to the PCS interface, which is acquired from the intermediate register between the main state machine and the output. This loop is interrupted upon reception of a stop signal from the main state machine, sent at the end of the payload state.

- SEND

  After the stop signal is received, the CRC register is rewired in a shift register configuration, so that it can produce the final value byte by byte on its output with the desired endianness. After transmission of the four CRC bytes, this state machine is restored to its initial conditions.

Internal FIFO buffer details
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As mentioned in the Input Interface section, payload bytes are stored in a FIFO memory until enough of them are available to assemble and send a UDP datagram of the desired length.

In VHDL, this component is defined as a single-clock FIFO IP core instance targeting Lattice ECP5UM-5G. The specific module chosen for instantiation is determined at compile time based on the requested payload size. Each option has a different power of 2 as depth, from a minimum of 64 to a maximum of 16384. Whenever the chosen depth is 2048 or greater, dedicated block RAM resources are used to synthesize the module.
All FIFO options are 1-byte wide and feature synchronous reset, empty and full flags and a data counter indicating the amount of bytes currently stored in the module's RAM. When attempting to transfer a value from the FIFO to a register, the effective latency of the read operation is 2 clock cycles, as the requested value will be visible at the output of the memory module shortly after the active edge that follows the request. The module is wired in such a way that it is not possible to write the FIFO while full or to read it while empty, as long as single bit errors do not occur in the control logic of the UDP formatter. If the user attempts to write the FIFO while full, data will be ignored, an error flag will be raised for one clock cycle and an error counter will be incremented.
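
The write-side behaviour described above can be illustrated with a small behavioural model. This Python sketch is not derived from the Lattice IP core: the class and method names are hypothetical, the 2-cycle read latency is not modelled, and the counter is left unbounded rather than limited to ``CONFIG_DATA_WIDTH`` bits.

.. code-block:: python

   from collections import deque


   class ByteFifoModel:
       """Behavioural model of the payload FIFO: writes while full are dropped."""

       def __init__(self, depth: int):
           self.depth = depth
           self.mem = deque()
           self.cnt_err_fifo_full = 0   # models cnt_err_fifo_full_o

       def write(self, byte: int) -> bool:
           """Write one byte; returns True when err_fifo_full_o would pulse."""
           if len(self.mem) >= self.depth:
               self.cnt_err_fifo_full += 1   # byte is ignored and counted
               return True
           self.mem.append(byte & 0xFF)
           return False

       def packet_ready(self, udp_payload_size: int) -> bool:
           """True when enough bytes are stored for one complete payload."""
           return len(self.mem) >= udp_payload_size

       def read_payload(self, udp_payload_size: int) -> bytes:
           """Remove one payload worth of bytes in FIFO order."""
           if not self.packet_ready(udp_payload_size):
               raise RuntimeError("not enough data for a full packet")
           return bytes(self.mem.popleft() for _ in range(udp_payload_size))


   fifo = ByteFifoModel(depth=4)
   errors = [fifo.write(b) for b in (1, 2, 3, 4, 5)]   # the fifth write is dropped

In this usage example the fifth write finds the FIFO full, so it is ignored and the error counter increments by one, while the first four bytes remain available in order.
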