I have designed a simple UART receiver in Verilog using a state-machine approach.
Here is my code:
module my_serial_receiver(
    input            clk,
    input            reset_n,
    input            Rx,
    output reg [7:0] received_byte,
    output reg       byte_ready
);

    // One state per received data bit, plus IDLE and BYTE_READY
    parameter IDLE  = 4'd0, BIT_0 = 4'd1, BIT_1 = 4'd2,
              BIT_2 = 4'd3, BIT_3 = 4'd4, BIT_4 = 4'd5, BIT_5 = 4'd6,
              BIT_6 = 4'd7, BIT_7 = 4'd8, BYTE_READY = 4'd9;

    reg [3:0] state      = 0;
    reg [8:0] baud_clock = 0;
    reg       baud_sync  = 0;
    reg       baud_tick  = 0;
    reg       baud_reset = 0;
    // Baud divider: one baud_tick per bit period (216 clk cycles); while
    // baud_sync is set, the first period is stretched to roughly 1.5 bit
    // periods so that sampling lands in the middle of each bit.
    always @(posedge clk) begin
        if (baud_reset)
            baud_clock <= 9'd1;
        else if (baud_sync) begin
            if (baud_clock == 9'd322) baud_clock <= 0;
            else                      baud_clock <= baud_clock + 9'd1;
        end
        else begin
            if (baud_clock == 9'd215) baud_clock <= 0;
            else                      baud_clock <= baud_clock + 9'd1;
        end
    end

    // baud_tick pulses whenever baud_clock wraps back to zero
    always @(*) begin
        baud_tick <= ~|baud_clock;
    end
    // Receiver FSM: detect the start bit, then sample Rx once per baud_tick,
    // LSB first, and pulse byte_ready for one clock once the byte is in.
    always @(posedge clk or negedge reset_n) begin
        if (~reset_n) begin
            state         <= IDLE;
            received_byte <= 8'h0;
        end
        else begin
            case (state)
                IDLE: begin
                    byte_ready <= 0;
                    if (Rx == 0) begin          // Rx low = start bit
                        state      <= BIT_0;
                        baud_reset <= 1;
                        baud_sync  <= 1;        // stretch the first bit period
                    end
                end
                BIT_0: begin
                    baud_reset <= 0;
                    if (baud_tick) begin
                        baud_sync        <= 0;
                        received_byte[0] <= Rx;
                        state            <= BIT_1;
                    end
                end
                BIT_1: begin
                    if (baud_tick) begin
                        received_byte[1] <= Rx;
                        state            <= BIT_2;
                    end
                end
                BIT_2: begin
                    if (baud_tick) begin
                        received_byte[2] <= Rx;
                        state            <= BIT_3;
                    end
                end
                BIT_3: begin
                    if (baud_tick) begin
                        received_byte[3] <= Rx;
                        state            <= BIT_4;
                    end
                end
                BIT_4: begin
                    if (baud_tick) begin
                        received_byte[4] <= Rx;
                        state            <= BIT_5;
                    end
                end
                BIT_5: begin
                    if (baud_tick) begin
                        received_byte[5] <= Rx;
                        state            <= BIT_6;
                    end
                end
                BIT_6: begin
                    if (baud_tick) begin
                        received_byte[6] <= Rx;
                        state            <= BIT_7;
                    end
                end
                BIT_7: begin
                    if (baud_tick) begin
                        received_byte[7] <= Rx;
                        state            <= BYTE_READY;
                    end
                end
                BYTE_READY: begin
                    if (baud_tick) begin        // wait into the stop bit
                        byte_ready <= 1;
                        state      <= IDLE;
                    end
                end
                default: state <= IDLE;
            endcase
        end
    end
endmodule
And here is a picture of my simulation results:

For the simulation I sent the bytes 0x55, 0x11, 0x32, 0x63, and 0xFF. The byte_ready signal asserts at the correct time for each of those bytes, for exactly one clock cycle, so the simulation appears to be working perfectly.
I have also simulated with varying amounts of baud-rate error (note: I am designing this for a baud rate of 115200), and the simulation still worked properly.
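For reference, this is roughly the kind of stimulus I drive in ModelSim (a stripped-down sketch, not my exact testbench; the 25 MHz clock here is only illustrative, since what matters to the receiver is the 216-clocks-per-bit ratio):

`timescale 1ns / 1ps
module tb_my_serial_receiver;
    reg clk = 0;
    reg reset_n = 0;
    reg Rx = 1;                           // UART line idles high
    wire [7:0] received_byte;
    wire       byte_ready;

    localparam integer BIT_CYCLES = 216;  // matches the receiver's divider

    my_serial_receiver dut (
        .clk(clk),
        .reset_n(reset_n),
        .Rx(Rx),
        .received_byte(received_byte),
        .byte_ready(byte_ready)
    );

    always #20 clk = ~clk;                // 25 MHz assumed for illustration

    // Drive one 8N1 frame: start bit, 8 data bits LSB first, stop bit
    task send_byte(input [7:0] b);
        integer i;
        begin
            Rx = 0; repeat (BIT_CYCLES) @(posedge clk);        // start bit
            for (i = 0; i < 8; i = i + 1) begin
                Rx = b[i]; repeat (BIT_CYCLES) @(posedge clk); // data bits
            end
            Rx = 1; repeat (BIT_CYCLES) @(posedge clk);        // stop bit
        end
    endtask

    initial begin
        repeat (4) @(posedge clk);
        reset_n = 1;
        send_byte(8'h55);
        send_byte(8'h11);
        send_byte(8'h32);
        send_byte(8'h63);
        send_byte(8'hFF);
        repeat (10 * BIT_CYCLES) @(posedge clk);
        $stop;
    end
endmodule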
On hardware, I used the SignalTap logic analyzer to confirm that the Rx signal is actually reaching the FPGA, and also to observe the state progression of the system, but the state never changes: it stays in IDLE even though I can see Rx arriving.
I have also modified the design to flash different LEDs for the current state or the received byte, and nothing lights up.
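Roughly what I mean by the LED test (a sketch only; LED stands for my board's LED pins, and showing the state instead would need an extra debug output on the receiver):

// Debug wrapper sketch: LED is a placeholder for the board's LED pins
module receiver_led_debug(
    input        clk,
    input        reset_n,
    input        Rx,
    output [7:0] LED
);
    wire [7:0] received_byte;
    wire       byte_ready;

    my_serial_receiver u_rx (
        .clk(clk),
        .reset_n(reset_n),
        .Rx(Rx),
        .received_byte(received_byte),
        .byte_ready(byte_ready)
    );

    // Show the last received byte on the LEDs
    assign LED = received_byte;
endmodule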
It appears that the design is not reacting at all. I am completely lost as to what to do next; any help would be greatly appreciated.
EDIT:
I have managed to get the LEDs reacting. Now, however, the bytes I receive appear to be completely random. Probing the signals, I can see that the LSB of my state register (state[0]) does not progress the way it does in simulation.
Since the state encodings are consecutive (IDLE = 0, BIT_0 = 1, ..., BYTE_READY = 9), state[0] should toggle for every single Rx bit received, but SignalTap shows it doing something else.
SignalTap and ModelSim waveforms (the ModelSim one shows what is supposed to happen):

How can I fix this discrepancy?