diff --git a/.github/workflows/build.yml b/.github/workflows/build.yml new file mode 100644 index 0000000..043dccc --- /dev/null +++ b/.github/workflows/build.yml @@ -0,0 +1,21 @@ +name: Build + +on: + push: + branches: [ main ] + pull_request: + branches: [ main ] + +jobs: + build: + + runs-on: ubuntu-latest + + steps: + - uses: actions/checkout@v2 + + - name: Install dependencies + run: sudo apt-get install -qq scons gcc-avr binutils-avr avr-libc + + - name: Build + run: scons diff --git a/.travis.yml b/.travis.yml deleted file mode 100644 index cf0e0b6..0000000 --- a/.travis.yml +++ /dev/null @@ -1,6 +0,0 @@ -language: c -before_install: - - sudo apt-get update -qq -install: - - sudo apt-get install -qq scons gcc-avr binutils-avr avr-libc -script: scons diff --git a/README.md b/README.md index e9bd6c7..d87ccf9 100644 --- a/README.md +++ b/README.md @@ -1,20 +1,39 @@ -# nanoBoot +# nanoBoot (w/LED) -[![Build Status](https://travis-ci.org/volium/nanoBoot.svg?branch=master)](https://travis-ci.org/volium/nanoBoot) +## HID bootloader with LED support & overwrite protection -This repository contains the source code for the USB HID-based bootloader for ATmegaXXU4 family of devices. + -The name *nanoBoot* comes from the fact that the compiled source fits in the smallest available boot size on the ATMegaXXu4 devices, 256 words or 512 bytes. The code is based on Dean Camera's [LUFA](https://github.com/abcminiuser/lufa) USB implementation, but it is **EXTREMELY** streamlined, size-optimized and targeted for the [ATmega16U4](http://www.atmel.com/devices/atmega16u4.aspx) and [ATmega32u4](http://www.atmel.com/devices/atmega32u4.aspx) devices; I had to make quite a few hardware assumptions, mostly to the fuse settings related to clock configuration for things to be as compact as possible, but the code still allows for some flexibility. 
+This repository, [nanoBoot w/LED](https://github.com/osamuaoki/nanoBoot), contains the source code for the USB HID-based bootloader for the ATmega32U4 family of devices, with **LED support and overwrite protection**. -It's very likely that a few sections can be rewritten to make it even smaller, and the ultimate goal is to support EEPROM programming as well, although that would require changes to the host code. +The name **nanoBoot** comes from the fact that the compiled source fits in the smallest available boot size on ATmega32U4 devices, 256 words or **512 bytes**. The code is based on Dean Camera's [LUFA](https://github.com/abcminiuser/lufa) USB implementation, but it is **EXTREMELY** streamlined, size-optimized and targeted for the [ATmega16U4](http://www.atmel.com/devices/atmega16u4.aspx) and [ATmega32u4](http://www.atmel.com/devices/atmega32u4.aspx) devices. -The current version (commit #[d0ea26b](https://github.com/volium/nanoBoot/commit/d0ea26bb01e764340dc8ad7b473ad98cefdb52eb)) is supported as-is in the 'hid_bootloader_loader.py' script that ships with [LUFA-151115](https://github.com/abcminiuser/lufa/releases/tag/LUFA-151115), and is exactly 506 bytes long. +The initial and major portion of the manual assembly code optimization was done by [volium](https://github.com/volium) and published as the original [volium/nanoBoot](https://github.com/volium/nanoBoot). + +osamu then made some tweaks to allow an arbitrary setting for the CKDIV8 fuse, and these were merged upstream. + +[sigprof](https://github.com/sigprof) contributed many manual size optimizations and a program size check feature, published as [sigprof/nanoBoot](https://github.com/sigprof/nanoBoot). + +Osamu gathered all the useful code into a linear commit history, with his LED support added, as the **led** branch at [osamuaoki/nanoBoot](https://github.com/osamuaoki/nanoBoot).
+ +Binary size: +* 468 bytes (proposed to upstream as default) + * no LED +* 474 bytes (opt-in for upstream but default in my branch) + * enable LED support with "LED_ACTIVE_LEVEL 1" (Leonardo, Nano, Teensy 2.0-type) +* 476 bytes (opt-in) + * enable LED support with "LED_ACTIVE_LEVEL 0" (Pro Micro-type) + +The current version (2021-12-08) will be tested manually with the compiled `hid_bootloader_cli.c` from [LUFA](https://github.com/abcminiuser/lufa) on Debian GNU/Linux 12 (bookworm/testing). + +Required packages on a Debian GNU/Linux system: `gcc-avr`, `avr-libc`, `binutils-avr`, `libusb-dev`, `build-essential`, `git` ## HW assumptions: * CLK is 16 MHz Crystal and fuses are setup correctly to support it: * Select Clock Source (CKSEL3:CKSEL0) fuses are set to Extenal Crystal, CKSEL=1111 SUT=11 - * Divide clock by 8 fuse (CKDIV8) can be set to either 0 or 1 + * Divide clock by 8 fuse (CKDIV8) can be any value. + * 16 MHz operation requires 5 V VCC for the MCU * Bootloader starts on reset; Hardware Boot Enable fuse is configured, HWBE=0 * Boot Flash Size is set correctly to 256 words (512 bytes), StartAddress=0x3F00, BOOTSZ=11 * Device signature = 0x1E9587 @@ -24,10 +43,64 @@ The current version (commit #[d0ea26b](https://github.com/volium/nanoBoot/commit * hfuse memory = 0xD6 (EESAVE=0, BOOTRST=0) * efuse memory = 0xC7 (=0xF7, No BOD) -* Alternatively, BOD can be used to ease CKSEL-SUT setting requirements to - allow teensy-like FUSE settings: +* Alternatively, BOD can be used to ease CKSEL-SUT setting requirements to + allow Teensy-like fuse settings: * lfuse memory = 0x5F (CKDIV8=0, 16CK + 0ms) * hfuse memory = 0xDF (EESAVE=1, BOOTRST=1) - * efuse memory = 0xF4 (BOD=2.4V) + * efuse memory = 0xC4 (=0xF4, BOD=2.4V) + +* LED on port D6 for Teensy 2.0 (configurable via #define for any board) + +## Usage + +Install this bootloader (`nanoBoot.hex`) using an ISP programmer (e.g. AVRISP mkII).
+ +``` +$ sudo avrdude -v -p atmega32u4 -c avrisp2 -Pusb -e -U flash:w:nanoBoot.hex \ + -U lfuse:w:0x5f:m -U hfuse:w:0xdf:m -U efuse:w:0xc4:m +``` + +You can start this bootloader by connecting the board to the PC with a USB cable and pressing the RESET button. It is a good idea to monitor the PC's USB connection. + +``` + $ watch lsusb +``` + +If this bootloader is started, you should see "Atmel". + +Note that this bootloader now turns on the LED just before sending the device ID, so monitoring USB is optional. + +(If the LED doesn't turn on after a 10-second wait for any reason, press the RESET button again.) + +Then program the MCU with a firmware such as `LED.hex`: + +``` + $ sudo hid_bootloader_cli -mmcu=atmega32u4 -v LED.hex +``` +Note that this bootloader turns off the LED when programming finishes. + +(Pressing the RESET button during active bootloader execution seems to halt the bootloader. This seems to be the reason you need to press the RESET button again.) + +For your convenience, a pre-compiled HEX file and associated scripts are provided under the `precompile` directory. + +## Configuration + +Only the first configuration choice is tested with a Teensy 2.0 compatible board. + +In `Makefile`: + +* `F_CPU = 16000000` or `F_CPU = 8000000` +* `BOOT_START_OFFSET = 0x7E00` or any other value valid for the MCU + +In `nanoBoot.S`: + +* Adjust `#define LED_BIT`, `#define LED_CONF`, `#define LED_PORT`, and `#define LED_ACTIVE_LEVEL` for each board. The default is the Teensy 2.0 setting. + +## Documentation + +"The documentation is part of the source code itself, and even though some people may find it extremely verbose, I think that's better than lack of documentation; after all, assembly can be hard to read sometimes... ohhh yes, in case that was not expected, this is all written in pure GAS (GNU Assembly), compiled using the [Atmel AVR 8-bit Toolchain](http://www.atmel.com/tools/atmelavrtoolchainforwindows.aspx)."
(per volium) + +"The elegant programming techniques presented by volium with detailed comments were very enlightening for me to get started. It's delightful for me to read. Don't miss it!" (per osamu) -The documentation is part of the source code itself, and even though some people may find it extremely verbose, I think that's better than lack of documentation; after all, assembly can be hard to read sometimes... ohhh yes, in case that was not expected, this is all written in pure GAS (GNU Assembly), compiled using the [Atmel AVR 8-bit Toolchain](http://www.atmel.com/tools/atmelavrtoolchainforwindows.aspx). + * [AVR Instruction Set Manual](http://ww1.microchip.com/downloads/en/devicedoc/atmel-0856-avr-instruction-set-manual.pdf) + * [ATmega16U4, ATmega32U4 - Complete Datasheet](http://ww1.microchip.com/downloads/en/devicedoc/atmel-7766-8-bit-avr-atmega16u4-32u4_datasheet.pdf) diff --git a/nanoBoot.S b/nanoBoot.S index a2497e3..83da694 100644 --- a/nanoBoot.S +++ b/nanoBoot.S @@ -48,9 +48,83 @@ ; hfuse memory = 0xDF ; efuse memory = 0xC4 + +; ========================================================== +; LED SUPPORT START + +; Turn LED on while bootloader is active +; NOTE: This feature uses 6 bytes for active high and 8 bytes for active low; +; see "Enable LED" in the "run_bootloader" section for details. 
+ +; Uncomment the following line to enable LED feature +#define LED_ENABLED + +; LED Configuration +; Uncomment or add a new LED configuration for your specific board + +; Adafruit's Atmega32u4 Breakout Board (Product ID: 296) - Now discontinued +; https://www.adafruit.com/product/296 +; -- LED is ON with ATmega32u4 PIN E6 HIGH +; #define LED_BIT 6 +; #define LED_CONF DDRE +; #define LED_PORT PORTE +; #define LED_ACTIVE_LEVEL 1 + +; Teensy 2.0 compatible board +; -- LED is ON with ATmega32u4 PIN D6 HIGH +#define LED_BIT 6 +#define LED_CONF DDRD +#define LED_PORT PORTD +#define LED_ACTIVE_LEVEL 1 + +; Leonardo/Nano compatible board +; -- LED is ON with ATmega32u4 PIN C7 HIGH +; #define LED_BIT 7 +; #define LED_CONF DDRC +; #define LED_PORT PORTC +; #define LED_ACTIVE_LEVEL 1 + +; Pro Micro compatible board +; -- LED is ON with ATmega32u4 PIN D5 LOW +; #define LED_BIT 5 +; #define LED_CONF DDRD +; #define LED_PORT PORTD +; #define LED_ACTIVE_LEVEL 0 + +; Pro Micro compatible board +; -- LED is ON with ATmega32u4 PIN B3 LOW +; #define LED_BIT 3 +; #define LED_CONF DDRB +; #define LED_PORT PORTB +; #define LED_ACTIVE_LEVEL 0 + +#if defined(LED_ENABLED) + #if !defined(LED_PORT) || !defined(LED_CONF) || !defined(LED_BIT) || !defined(LED_ACTIVE_LEVEL) + #error "If LED feature is enabled, the following need to be defined: LED_BIT, LED_CONF, LED_PORT, LED_ACTIVE_LEVEL" + #else + ; Set IO register as output for LED + #define ENABLE_LED_OUTPUT sbi _SFR_IO_ADDR(LED_CONF), LED_BIT + #if LED_ACTIVE_LEVEL == 1 + #define TURN_LED_ON sbi _SFR_IO_ADDR(LED_PORT), LED_BIT + #define TURN_LED_OFF cbi _SFR_IO_ADDR(LED_PORT), LED_BIT + #elif LED_ACTIVE_LEVEL == 0 + #define TURN_LED_ON cbi _SFR_IO_ADDR(LED_PORT), LED_BIT + #define TURN_LED_OFF sbi _SFR_IO_ADDR(LED_PORT), LED_BIT + #else + #error "LED_ACTIVE_LEVEL needs to be either 1 (active high) or 0 (active low)" + #endif + #endif +#endif + +; LED SUPPORT END +; ========================================================== + + ; SW 
assumptions: -; All Endpoints are being configured sequentially in ascending order, -; but, since we only use EP0, this is not that important +; The bootloader only "needs" endpoint 0; however, the HID spec requires any HID device to have an +; Interrupt IN endpoint, and the host can decide to poll that endpoint even when the HID report +; descriptor does not actually declare any input reports. Because of this, endpoints 0 and 1 are +; configured (in reversed order). See commit bd1bd68e200485aa16445c9d22dd53bb205ee102 for details. ; Register Assignments: @@ -123,6 +197,14 @@ #define oUEBCHX (UEBCHX - USB_BASE) #define oUEINT (UEINT - USB_BASE) ; This register has the bits to identify which endpoint triggered an interrupt +; +; To facilitate coding, we will also use the Y register to point to the first Extended IO register; +; We can then use LDD / STD (Y+oU....) to address non-USB extended IO registers (EIO_BASE + relative offset) +; (These are used only in start-up and exit routines when USB is not active) +; +#define EIO_BASE WDTCSR +#define oWDTCSR (WDTCSR - EIO_BASE) +#define oCLKPR (CLKPR - EIO_BASE) #include @@ -131,9 +213,6 @@ # define BOOT_ADDRESS 0 #endif -; For debugging purposes -.equ LED_PIN, 6 - .section .vectors ; We still want the reset vector to jump to "main" @@ -310,14 +389,16 @@ main: in rMCUSR, _SFR_IO_ADDR(MCUSR) ; Load MCU Status Register to rMCUSR out _SFR_IO_ADDR(MCUSR), rZERO ; Load MCU Status Register with rZERO (clear reset flags, particularly clear WDRF in MCUSR), necessary before disabling the Watchdog + ; Use Y+ for different purpose with YH=R29 to 0 for addressing extended io for any 64 bytes of YL specified section + ; * WDT initialization routine: YL=lo8(EIO_BASE) --- (wdt_init) -- start and end of bootloader + ; * USB communication routine: YL=lo8(USB_BASE) --- (usb_init) -- main part of bootloader + clr YH ; 0 = hi8(USB_BASE) = hi8(EIO_BASE) = 0 common initialization + ; We MUST disable the Watchdog Timer first, otherwise it will 
remain enabled and will keep resetting the system, so... ; Disable Watchdog Timer - ldi r16, _BV(WDCE) | _BV(WDE) ; Load r16 with the value needed to "unlock" the Watchdog Timer Configuration - ; Write a logic one to the Watchdog Change Enable bit (WDCE) and Watchdog System Reset Enable (WDE) - mov r17, rZERO ; Load r17 with zero to disable the Watchdog Timer completely - rcall set_watchdog_timer ; Call the subroutine that sets the wathdog timer with the values loaded in r16 and r17 + rcall set_watchdog_timer ; Call the subroutine that sets the watchdog timer with the value loaded in r17 ; check_reset_flags: sbrs rMCUSR, EXTRF ; Skip the next instruction if EXTRF is set (if External Reset Flag, skip next instruction, go to run_bootloader) @@ -328,6 +409,19 @@ run_application: ; We get here if the cause of th run_bootloader: set ; Initialize BootLoaderActive flag (T flag in SREG) +; Enable LED +#if defined(LED_ENABLED) + ; Set IO register as output for LED + ENABLE_LED_OUTPUT + + ; If the LED is active low, we need to turn it off here (set LED_BIT) since the MCU IO port is + ; initialized as 0. This is a 2-byte penalty when using active low LED. + #if LED_ACTIVE_LEVEL == 0 + TURN_LED_OFF + #endif +#endif + + ; ================================================================= ; = Setup IRQ Vector Table @@ -377,8 +471,9 @@ run_bootloader: ; code to work as expected. ldi r17, _BV(CLKPCE) ; Load r17 with the value needed to "unlock" the prescaler of the Clock; Clock Prescaler Change Enable bit (CLKPCE) set to one, all other bits set to zero. 
- sts CLKPR, r17 ; Store r17 to the Clock Prescaler Register (CLKPR) - sts CLKPR, rZERO ; Store rZERO to the Clock Prescaler Register (CLKPR), setting CLKPS3, CLKPS2, CLKPS1 and CLKPS0 to zero (Clock Division Factor = 1; System Clock is 16 MHz) + ; still YH=0, YL=lo8(EIO_BASE) initial routine + std Y+oCLKPR, r17 ; Store r17 to the Clock Prescaler Register (CLKPR) + std Y+oCLKPR, rZERO ; Store rZERO to the Clock Prescaler Register (CLKPR), setting CLKPS3, CLKPS2, CLKPS1 and CLKPS0 to zero (Clock Division Factor = 1; System Clock is 16 MHz) ; ================================================================= ; = Basic device setup is NOW COMPLETE!! @@ -389,8 +484,9 @@ run_bootloader: ; = Configure Y register to point to USB_BASE (UHWCON register) ; ================================================================= - ldi YL, lo8(USB_BASE) ; Load YL with the least significant 8 bits of USB_BASE - ldi YH, hi8(USB_BASE) ; Load YH with the most significant 8 bits of USB_BASE + ldi YL, lo8(USB_BASE) ; Load YL with the least significant 8 bits of USB_BASE (usb_init) + ; still YH=0 + ; ldi YH, hi8(USB_BASE) ; Load YH with the most significant 8 bits of USB_BASE ; ================================================================= ; = From LUFA simplified - USB_Init:_start @@ -408,13 +504,12 @@ run_bootloader: ; USBCON &= ~(1 << VBUSTE); ; USBCON &= ~(1 << USBE); - ; IMPORTANT NOTE: To reduce code size, we are going to reseve r16 to handle all writes to the USB Controller Register (USBCON) - ; this way we don't have to keep loading the value to it (ldd) - ldd r16, Y+oUSBCON ; Load r16 with the value in the USB Configuration Register (USBCON) - - ; The right value of USBCON is already in r16, just clear VBUS Pad Enable Bit (OTGPADE), - ; VBUS Transition Interrupt Enable Bit (VBUSTE) and USB macro Enable Bit (USBE) - andi r16, ~(_BV(OTGPADE)|_BV(VBUSTE)|_BV(USBE)) + ; SIZE OPTIMIZATION: Instead of resetting just some specific bits, initialize the whole USBCON + ; register with its 
reset value (although even this could be omitted, this initialization is left + ; here in case the application tries to enter the bootloader in a slightly incorrect way). + ; As a further optimization, the USBCON register value is left in r16 for use in subsequent code + ; which modifies various bits of that register. + ldi r16, _BV(FRZCLK) ; Load r16 with the reset value for the USB Configuration Register (USBCON) std Y+oUSBCON, r16 ; Store r16 to the USB Configuration Register (USBCON) ; Enable USB Regulator (USB_REG_On) @@ -513,9 +608,18 @@ main_loop: exit_bootloader: ; Detach device from USB Bus ; UDCON |= (1 << DETACH); - ldd r16, Y+oUDCON ; Load r16 with the value in the USB Device Configuration Register (UDCON) - ori r16, _BV(DETACH) ; Set the DETACH bit to enable the detachment - std Y+oUDCON, r16 ; Store r16 to the USB Device Configuration Register (UDCON) + ; SIZE OPTIMIZATION: All other UDCON bits except DETACH can be set to 0 at this time, and the value + ; of _BV(DETACH) is 0x01, therefore we can just store rONE into UDCON. + ; In theory this step could even be removed completely, because the watchdog reset should set the + ; DETACH bit anyway, but doing this here ensures that the host detects the USB device detach before + ; the application is started, which could avoid issues if the application does not add some delay + ; before enabling USB. + std Y+oUDCON, rONE ; Store _BV(DETACH) (== 0x01) to the USB Device Configuration Register (UDCON) + +#if defined(LED_ENABLED) + ; Turn LED off before exiting + TURN_LED_OFF +#endif ; ================================================================= ; = Watchdog Timer initialization @@ -524,13 +628,10 @@ exit_bootloader: ; NOTE!! 
This part of the code assumes MCUSR has already been cleared ; Enable WDT, ~250 ms timeout (force a timeout to reset the AVR) - ldi r16, _BV(WDCE) | _BV(WDE) ; Load r16 with the value needed to "unlock" the Watchdog Timer Configuration - ; Write a logic one to the Watchdog Change Enable bit (WDCE) and Watchdog System Reset Enable (WDE) - ldi r17, _BV(WDE) | _BV(WDP2) ; Load r17 with the value needed to set the desired Watchdog Configuration (WDCE = 0, not set!) ; Write the WDE and Watchdog prescaler bits (WDP); System Reset Mode (WDE = 1) and ~250 ms timeout (WDP2 = 1) - rcall set_watchdog_timer ; Call the subroutine that sets the wathdog timer with the values loaded in r16 and r17 + rcall set_watchdog_timer ; Call the subroutine that sets the watchdog timer with the value loaded in r17 ; for (;;); final_loop: @@ -553,29 +654,55 @@ USB_General_ISR: ; service_EORSTI: ; unused label ; ================================================================= -; = Configure Endpoint0 +; = Configure Endpoints ; ================================================================= - ; ASSUMPTION! - ; We only use Endpoint0, and the reset value of the USB Device Select Endpoint Number Register (UENUM) is Zero, - ; so we don't need to select it or do anything else + ; Even though the bootloader uses only endpoint 0, the HID spec requires any HID device to have an + ; Interrupt IN endpoint, and the host can decide to poll that endpoint even when the HID report + ; descriptor does not actually declare any input reports. Polling an unconfigured endpoint causes + ; USB errors, therefore endpoint 1 must be configured here too. 
- ; Enable Endpoint + ; Enable and configure endpoint 1 as Interrupt IN: + ; UENUM = 1; + ; UECONX |= (1 << EPEN); + ; UECFG0X = (1 << EPTYPE1) | (1 << EPTYPE0) | (1 << EPDIR); + ; UECFG1X = (1 << EPSIZE1) | (1 << EPSIZE0) | (1 << ALLOC); + + std Y+oUENUM, rONE ; Select Endpoint 1 + + ; Set Endpoint Enable Bit (EPEN), all other bits set to zero has no effect on UECONX + std Y+oUECONX, rONE ; Store the USB Endpoint Configuration Register (UECONX) with the value needed to enable Endpoint 1 + + ldi r16, (_BV(EPTYPE1) | _BV(EPTYPE0) | _BV(EPDIR)) ; Load r16 with the value to configure Endpoint 1 + ; Endpoint Type Bits (EPTYPE1:0); 11 to set as Interrupt Endpoint + ; Endpoint Direction Bit (EPDIR); set to configure IN direction + + std Y+oUECFG0X, r16 ; Store r16 to the USB Endpoint Configuration0 Register (UECFG0X); + + ldi r16, (_BV(EPSIZE1) | _BV(EPSIZE0) | _BV(ALLOC)) ; Load r16 with the value to configure Endpoint 1 (and also 0 below) + ; Endpoint Size Bits (EPSIZE2:0); 011 to set to 64 bytes + ; Endpoint Bank Bits (EPBK1:0); 00 to set One bank + ; Endpoint Allocation Bit (ALLOC); set to allocate the endpoint memory + + std Y+oUECFG1X, r16 ; Store r16 to the USB Endpoint Configuration1 Register (UECFG1X); + + ; Enable and configure endpoint 0 as Control (this is done last, so that endpoint 0 will remain selected): + ; UENUM = 0; ; UECONX |= (1 << EPEN); ; UECFG0X = 0; - ; UECFG1X = 0x32; + ; UECFG1X = (1 << EPSIZE1) | (1 << EPSIZE0) | (1 << ALLOC); + + std Y+oUENUM, rZERO ; Select Endpoint0 + ; Set Endpoint Enable Bit (EPEN), all other bits set to zero has no effect on UECONX - std Y+oUECONX, rONE ; Store the USB Endpoint Configuration Register (UECONX) with the value needed to enable Enpoint 0 + std Y+oUECONX, rONE ; Store the USB Endpoint Configuration Register (UECONX) with the value needed to enable Endpoint 0 ; SIZE OPTIMIZATION: Not needed due to known reset value (Zero) ; std Y+oUECFG0X, rZERO ; Store rZERO to the USB Endpoint Configuration0 Register 
(UECFG0X); ; Endpoint Type Bits (EPTYPE1:0): 00 to set as Control Endpoint ; Endpoint Direction Bit (EPDIR): clear to configure OUT direction; needed for Control Endpoint - ldi r16, (_BV(EPSIZE1) | _BV(EPSIZE0) | _BV(ALLOC)) ; Load r16 with the value to configure Enpoint 0 - ; Endpoint Size Bits (EPSIZE2:0); 011 to set to 64 bytes - ; Endpoint Bank Bits (EPBK1:0); 00 to set One bank - ; Endpoint Allocation Bit (ALLOC); set to allocate the endpoint memory + ; SIZE OPTIMIZATION: r16 is already loaded with the required value while configuring endpoint 1 above std Y+oUECFG1X, r16 ; Store r16 to the USB Endpoint Configuration1 Register (UECFG1X); @@ -620,11 +747,10 @@ USB_Endpoint_ISR: ; Shorter version clr XH ; Clear XH Register ldi XL, 18 ; Load XL Register with number 18 (this will be used to refer to r18) - ldi r16, 8 ; Load r16 with number 8 (the number of fields we need to read) load: ldd r0, Y+oUEDATX ; Load r0 with the value in the USB Endpoint Data Register (UEDATX) st X+, r0 ; Store the value of r0 to the location pointed by X (r18), post increment X (X now points to r19) - dec r16 ; Decrement r16 - brne load ; Jump back to 'load' if r16 is not zero + cpi XL, 18+8 ; Compare XL with the location past the last byte that we need to read + brne load ; Jump back to 'load' if there are still bytes to read ; Our response is based on data direction... 
sbrc reg_bmRequestType, 7 ; Skip the next instruction if bit 7 of bmRequestType is not set; for host to device (OUT) transaction, bit 7 is cleared @@ -637,84 +763,49 @@ HOST_TO_DEVICE: cpi reg_bmRequestType, 0x00 ; Compare r18 (bmRequestType) with value 0x00 (OUT Type Resquest, USB Standard Request, Recipient is the device) breq HANDLE_USB_STANDARD_DEVICE ; If bmRequestType is 0x00, we know it's either a SET_ADDRESS or SET_CONFIGURATION request, so jump to HANDLE_USB_STANDARD_DEVICE - andi reg_bmRequestType, (0x60 | 0x1F) ; Mask reg_bmRequestType with the bits that define request type and recipient (CONTROL_REQTYPE_TYPE | CONTROL_REQTYPE_RECIPIENT) - cpi reg_bmRequestType, ((1 << 5) | (1 << 0)) ; Compare the masked value in r16 with the value that defines the request type and recipient we care about HID_SET_REPORT (REQTYPE_CLASS | REQREC_INTERFACE) - breq HANDLE_USB_CLAS_INTERFACE ; jump to HANDLE_USB_CLAS_INTERFACE - - rjmp UNHANDLED_SETUP_REQUEST ; If reg_bmRequestType is not 0x00 or bRequest is not 0x05 or 0x09, we don't handle those cases, so jump to UNHANDLED_SETUP_REQUEST - -HANDLE_USB_STANDARD_DEVICE: - - ; Once we know we support the OUT transaction, we need to filter it based on the value in bRequest - cpi reg_bRequest, 0x05 ; Compare bRequest with value 0x05 (REQ_SetAddress) - breq SET_ADDRESS ; jump to SET_ADDRESS - cpi reg_bRequest, 0x09 ; Compare bRequest with value 0x09 (REQ_SetConfiguration) - breq SET_CONFIGURATION ; jump to SET_CONFIGURATION - - rjmp UNHANDLED_SETUP_REQUEST ; If reg_bmRequestType is not 0x00 or bRequest is not 0x05 or 0x09, we don't handle those cases, so jump to UNHANDLED_SETUP_REQUEST - -HANDLE_USB_CLAS_INTERFACE: + cpi reg_bmRequestType, ((1 << 5) | (1 << 0)) ; Compare bmRequestType with the value that defines the request type and recipient we care about HID_SET_REPORT (REQTYPE_CLASS | REQREC_INTERFACE) + brne UNHANDLED_SETUP_REQUEST ; jump to UNHANDLED_SETUP_REQUEST if not equal + ; fallthrough to HANDLE_USB_CLASS_INTERFACE if 
equal +HANDLE_USB_CLASS_INTERFACE: cpi reg_bRequest, 0x09 ; Compare bRequest with value 0x05 (HID_REQ_SetReport) - breq SET_HID_REPORT ; jump to SET_HID_REPORT - rjmp UNHANDLED_SETUP_REQUEST ; If reg_bmRequestType is not 0x00 or bRequest is not 0x05 or 0x09, we don't handle those cases, so jump to UNHANDLED_SETUP_REQUEST - - -SET_ADDRESS: - - ; Set device address; for this we only need to copy the value in wValueL which contains the address - ; for the device set by the host to the USB Device Address Register (UDADDR); since the SET_ADDRESS - ; request is only executed once during enumeration, and because allowed address values are 1 through - ; 127 (7 LSBs), we don't need to care about the ADDEN bit (bit 7). We can also simply set the ADDEN - ; bit and store the value again in UDADDR to enable the USB Device Address. - - std Y+oUDADDR, reg_wValueL ; Store wValueL to the USB Device Address Register (UDADDR) - - rcall process_Host2Device ; This function affects r17 - - ; EnableDeviceAddress - ; UDADDR |= (1 << ADDEN) - ori reg_wValueL, _BV(ADDEN) ; In order to save space, we simply OR the address value already in reg_wValueL (r20) with the ADDEN bit to enable the USB Address - std Y+oUDADDR, reg_wValueL ; Store reg_wValueL to the USB Device Address Register (UDADDR) - - rjmp UNHANDLED_SETUP_REQUEST ; Go to UNHANDLED_SETUP_REQUEST - -SET_CONFIGURATION: - - rcall process_Host2Device ; This function affects r17 - - rjmp UNHANDLED_SETUP_REQUEST ; Go to UNHANDLED_SETUP_REQUEST - + brne UNHANDLED_SETUP_REQUEST ; If reg_bmRequestType is not 0x00 or bRequest is not 0x05 or 0x09, we don't handle those cases, so jump to UNHANDLED_SETUP_REQUEST + ; fallthrough to SET_HID_REPORT SET_HID_REPORT: - ; Acknowledge the SETUP packet - rcall clear_RXSTPI ; This function uses r17 to clear the RXSTPI bit in UEINTX - - ; Wait for command from the host - rcall wait_RXOUTI ; This function loads r17 with value of UEINTX + ; Acknowledge the SETUP packet and wait for command from the host + 
ldi r17, ~(_BV(RXSTPI)) ; Clear the Received SETUP Interrupt Flag (RXSTPI) in r17 + rcall clear_bit_and_wait_RXOUTI ; This function loads r17 with value of UEINTX load_page_address: - ; We store the page address in r15:r14 and not in r31:r30 because we need - ; to keep track of the page when we call write_page_to_flash - ldd r14, Y+oUEDATX ; Load r14 with LSB of page address - ldd r15, Y+oUEDATX ; Load r15 with MSB of page address + ldd r30, Y+oUEDATX ; Load r30 with LSB of page address + ldd r31, Y+oUEDATX ; Load r31 with MSB of page address check_page_address: - ldi r26, 0xFF ; Load value 0xFF to r26 - cp r26, r14 ; Compare low byte of page address against 0xFF - cpc r26, r15 ; Compare high byte of page address against 0xFF - brne erase_page ; if r15:r14 != 0xFFFF jump to erase_page - -quit_bootloader: - ; we received the START_APPLICATION command, change value of BootLoaderActive flag - clt ; clear the BootLoaderActive flag (T flag in SREG) - rjmp finish_hid_request ; jump to finish_hid_request + ; Protect against overwriting the bootloader - allow flash write only if the specified address is + ; less than the bootloader start address. Only the high byte needs to be tested, because the + ; bootloader start is guaranteed to be on a 256-byte boundary. + cpi r31, hi8(reset_vector) ; Compare high byte of page address against the high byte of the bootloader start address + brcs erase_page ; If the address is below the bootloader start, allow the flash write operation + + ; The address is definitely not correct for a flash write operation; however, simply jumping to + ; finish_hid_request would not just fail this SET_HID_REPORT request - apparently not reading the + ; OUT data properly results in the bootloader not responding to any subsequent USB requests too. + ; Instead of doing that, we run the normal flash write loop even if the address was bad, but set + ; the "disable flash write" bit, so that the actual flash write instructions will be skipped.
+ ; Bit 7 of reg_bRequest is used for that purpose - it is known to be 0 in the normal case. + sbr reg_bRequest, _BV(7) ; Set the "disable flash write" bit + + ; If the address is out of the allowed range for flash write, it may be the special value for the + ; START_APPLICATION command (0xffff); check for that value in the shortest way possible. + adiw r30, 1 ; Increment the address to turn 0xffff into 0x0000 + brne erase_page ; If the address was out of range and not 0xffff, jump to the regular flash write code + ; (which would just consume the OUT data to make USB work properly). + clt ; Otherwise (the address was 0xffff) clear the BootLoaderActive flag (T flag in SREG), + ; then fall through to the regular flash write code too. erase_page: - ; Set page address in Z-Register - movw r30, r14 ; Copy r15:r14 to r31:r30 (Z-Register) - ldi r17, (_BV(PGERS)|_BV(SPMEN)) ; load r17 with the value needed to erase the currently specified page rcall do_SPM ; execute page erase (this function requires r17 to be loaded first with the right value for SPMCSR) @@ -732,11 +823,9 @@ check_endpoint_for_more_data: or r26, r26 brne fill_page_buffer ; if r26 is not zero, it means there's data in the endpoint which we can use to fill the page buffer, jump there - ; Acknowledge the OUT packet - rcall clear_RXOUTI ; This function uses r17 to clear the RXOUTI bit in UEINTX - - ; Wait for more data from the host - rcall wait_RXOUTI ; This function loads r17 with value of UEINTX + ; Acknowledge the OUT packet and wait for more data from the host + ldi r17, ~(_BV(RXOUTI)) ; Clear the Received OUT Data Interrupt Flag (RXOUTI) in r17 + rcall clear_bit_and_wait_RXOUTI ; This function loads r17 with value of UEINTX fill_page_buffer: ; There's data at the endpoint buffer, start fill_page_buffer sequence @@ -749,13 +838,15 @@ write_page_buffer: rcall do_SPM ; execute page buffer write (this function requires r17 to be loaded first with the right value for SPMCSR) increment_byte_address: - adiw
r30, 2 ; Increment Z-Register (the current byte address) by 2 + subi r30, -2 ; Increment the current address by 2. + ; Only the low byte needs to be incremented, because the block start address must be page aligned, + ; therefore any carry to the high byte may happen only past the end of the block. dec r16 ; decrement r16 (number of words per page) brne check_endpoint_for_more_data ; loop while r16 is not equal to SPM_PAGESIZE (128) - ; Set page address in Z-Register - movw r30, r14 ; Copy r15:r14 (the original page address) back to r31:r30 (Z-Register) + ; Restore the page address in Z-Register + subi r30, SPM_PAGESIZE ; Move the address back to the start of page (again only the low byte needs to be changed). write_page_to_flash: ldi r17, (_BV(PGWRT)|_BV(SPMEN)) ; load r17 with the value needed to commit the current page buffer to the flash @@ -767,17 +858,74 @@ reenable_rww_section: finish_hid_request: - ; Acknowledge the OUT packet - rcall clear_RXOUTI ; This function uses r17 to clear the RXOUTI bit in UEINTX - - ; Wait for TXINI (OK to transmit) - rcall wait_TXINI ; This function loads r17 with value of UEINTX + ; Acknowledge the OUT packet and wait for TXINI (OK to transmit) + ldi r17, ~(_BV(RXOUTI)) ; Clear the Received OUT Data Interrupt Flag (RXOUTI) in r17 + rcall clear_bit_and_wait_TXINI ; This function loads r17 with value of UEINTX ; Clear Transmitter Ready Flag - rcall clear_TXINI ; This function uses r17 to clear the TXINI bit in UEINTX + ldi r17, ~(_BV(TXINI)) ; Clear the Transmitter Ready Interrupt Flag (TXINI) in r17 + rjmp clear_UEINTX_bit_and_reti ; Store r17 to the USB Endpoint Interrupt Register (UEINTX), then return from interrupt - rjmp UNHANDLED_SETUP_REQUEST ; Go to UNHANDLED_SETUP_REQUEST +HANDLE_USB_STANDARD_DEVICE: + + ; Once we know we support the OUT transaction, we need to filter it based on the value in bRequest + cpi reg_bRequest, 0x05 ; Compare bRequest with value 0x05 (REQ_SetAddress) + breq SET_ADDRESS ; jump to SET_ADDRESS + 
cpi reg_bRequest, 0x09 ; Compare bRequest with value 0x09 (REQ_SetConfiguration) + breq SET_CONFIGURATION ; jump to SET_CONFIGURATION + +UNHANDLED_SETUP_REQUEST: + + ; If we reach this part, the SETUP packet has not been handled, so we need to acknowledge it and request a stall + + ; Acknowledge the SETUP packet + ldi r17, ~(_BV(RXSTPI)) ; Clear the Received SETUP Interrupt Flag (RXSTPI) in r17 + std Y+oUEINTX, r17 ; Store r17 to the USB Endpoint Interrupt Register (UEINTX) + ; STALL transaction + + ; // Endpoint_StallTransaction(); + ; UECONX |= (1 << STALLRQ); + ; Size optimization: We know that the only other bit that should be set in UECONX is EPEN, therefore + ; reading the current register value is not needed. + ldi r16, _BV(STALLRQ) | _BV(EPEN) ; Set the STALL Request Handshake Bit (STALLRQ) and EPEN in r16 + std Y+oUECONX, r16 ; Store r16 to the USB Endpoint Configuration Register (UECONX) + + reti ; Return from interrupt + +SET_CONFIGURATION: +#if defined(LED_ENABLED) + ; Turn LED on towards the end of enumeration (SET_CONFIGURATION is done after SET_ADDRESS) + ; TODO: If we ever have space, we could add a flag here to mark the fact that we have entered + ; this state, and turn the LED on at the end of the setup request. For now this is the best we + ; can do. + TURN_LED_ON +#endif + + ; Optimization by "sigprof" that saves 2 bytes + ; Dirty trick: We don't need to do anything for SET_CONFIGURATION except process_Host2Device, + ; so we reuse the SET_ADDRESS code by making it reload the same value to UDADDR. 
+ + ldd reg_wValueL, Y+oUDADDR ; load the existing UDADDR value where the SET_ADDRESS code would expect the new address + +SET_ADDRESS: + + ; Set device address; for this we only need to copy the value in wValueL which contains the address + ; for the device set by the host to the USB Device Address Register (UDADDR); since the SET_ADDRESS + ; request is only executed once during enumeration, and because allowed address values are 1 through + ; 127 (7 LSBs), we don't need to care about the ADDEN bit (bit 7). We can also simply set the ADDEN + ; bit and store the value again in UDADDR to enable the USB Device Address. + + std Y+oUDADDR, reg_wValueL ; Store wValueL to the USB Device Address Register (UDADDR) + + rcall process_Host2Device ; This function affects r17 + + ; EnableDeviceAddress + ; UDADDR |= (1 << ADDEN) + ori reg_wValueL, _BV(ADDEN) ; In order to save space, we simply OR the address value already in reg_wValueL (r20) with the ADDEN bit to enable the USB Address + std Y+oUDADDR, reg_wValueL ; Store reg_wValueL to the USB Device Address Register (UDADDR) + + rjmp UNHANDLED_SETUP_REQUEST ; Go to UNHANDLED_SETUP_REQUEST ; IN transactions DEVICE_TO_HOST: @@ -785,81 +933,54 @@ DEVICE_TO_HOST: ; If we get here, we know bit 7 of bmRequestType is set, meaning it is a DEVICE_TO_HOST (IN) request, ; now we need to filter out any unhandled requests - cbr reg_bmRequestType, 0x01 ; We mask reg_bmRequestType with value 0x01, bit 0 of bmRequestType is set if the recipient of the request is the interface, - ; and we need to handle that case since the host will query the interface to retrieve the hid_descriptor, obviously we also - ; need to handle the recipient being the device (bit 0 = 0) since all other descriptors are targeted to it + ; SIZE OPTIMIZATION: The only bmRequestType values that we care about are: + ; - 0x80 - IN Type Request, USB Standard Request, Recipient is the device + ; - 0x81 - IN Type Request, USB Standard Request, Recipient is the interface + ; At 
this step it is known that bmRequestType >= 0x80, therefore checking for bmRequestType < 0x82 + ; is enough to detect whether bmRequestType has one of the above values. - cpi reg_bmRequestType, 0x80 ; Compare r18 (bmRequestType) with value 0x80 (IN Type Resquest, USB Standard Request, Recipient is the device/interface) - brne UNHANDLED_DEVICE_TO_HOST ; If bmRequestType is not 0x80, we know it's not a GET_DESCRIPTOR request, so jump to UNHANDLED_DEVICE_TO_HOST + cpi reg_bmRequestType, 0x82 ; Check whether bmRequestType is less than 0x82 (then it must be either 0x80 or 0x81) + brcc UNHANDLED_SETUP_REQUEST ; If bmRequestType >= 0x82, this request type is not handled here (it's not a GET_DESCRIPTOR request) cpi reg_bRequest, 0x06 ; Compare bRequest with value 0x06 (REQ_GetDescriptor) - breq GET_DESCRIPTOR ; jump to GET_DESCRIPTOR - -UNHANDLED_DEVICE_TO_HOST: - rjmp UNHANDLED_SETUP_REQUEST ; If reg_bmRequestType is not 0x80/0x81 or bRequest is not 0x06, we don't handle those cases, so jump to UNHANDLED_SETUP_REQUEST - + brne UNHANDLED_SETUP_REQUEST ; jump to UNHANDLED_SETUP_REQUEST if not equal + ; fallthrough to GET_DESCRIPTOR if equal GET_DESCRIPTOR: ; Just get the descriptor address into ; [RAMPZ:]Z, and the length into r16 - ; We know ALL descriptors are at the beginning of the bootloader, in the reset_vector space, - ; and by inspection we can determine that they all share the same high byte of the address (0x7EXX) - ldi ZH, 0x7E ; Load ZH with the most significant 8 bits of the descriptors address (0x7E) - - ; High byte of wValue for GET_DESCRIPTOR transactions specifies Descriptor Type - ; NOTE! We are skipping the comparison for 0x01 (Device Descriptor), since that can't really - ; be excluded, we simply assume that's the default to save space here. See @SAVE_SPACE below. 
- cpi reg_wValueH, 0x02 ; Compare high byte of wValue with value 2; - breq send_config_descriptor ; If high byte of wValue is 0x02 (Configuration Descriptor), jump to handle that - cpi reg_wValueH, 0x21 ; Compare high byte of wValue with value 0x21; - breq send_hid_descriptor ; If high byte of wValue is 0x21 (HID Class HID Descriptor), jump to handle that - cpi reg_wValueH, 0x22 ; Compare high byte of wValue with value 0x22; - breq send_hid_report_descriptor ; If high byte of wValue is 0x22 (HID Class HID Report Descriptor), jump to handle that - - ; If needed, include other descriptors here - - ; @SAVE_SPACE: I was able to comment this out and things still work, but it's probably bad (saves 6 bytes) - ; The following 2 lines are also dropped since we are skipping "rjmp UNHANDLED_SETUP_REQUEST" (osamuaoki) - ; cpi reg_wValueH, 0x01 ; Compare high byte of wValue with value 1; - ; breq send_device_descriptor ; If high byte of wValue is 0x01 (Device Descriptor), jump to handle that - ; NOTE: Originally, only this rjmp was skipped and things were still working, that's what - ; osamuaoki was able to use to optimize the check for (Device Descriptor), and simply fall through. 
- ; rjmp UNHANDLED_SETUP_REQUEST ; If the requested descriptor is not supported jump to UNHANDLED_SETUP_REQUEST - -send_device_descriptor: - ; We only load the lower portion (lo8) of the address of the descriptor, - ; the higher portion is common for all descriptors - ldi ZL, lo8(device_descriptor) ; Load ZL with the least significant 8 bits of device_descriptor - ldi r16, 18 ; Load r16 with length of device_descriptor (18 bytes) - rjmp process_descriptor ; jump to process_descriptor - -send_config_descriptor: - ; We only load the lower portion (lo8) of the address of the descriptor, - ; the higher portion is common for all descriptors - ldi ZL, lo8(config_descriptor) ; Load ZL with the least significant 8 bits of config_descriptor + ldi ZH, hi8(config_descriptor) ; Load the high address part of config_descriptor into ZH + ldi ZL, lo8(config_descriptor) ; Load the low address part of config_descriptor into ZL ldi r16, 34 ; Load r16 with length of config_descriptor (34 bytes) - rjmp process_descriptor ; jump to process_descriptor - -send_hid_descriptor: - ; We only load the lower portion (lo8) of the address of the descriptor, - ; the higher portion is common for all descriptors - ldi ZL, lo8(hid_descriptor) ; Load ZL with the least significant 8 bits of hid_descriptor - ldi r16, 9 ; Load r16 with length of hid_descriptor (9 bytes) - rjmp process_descriptor ; jump to process_descriptor - -send_hid_report_descriptor: - ; We only load the lower portion (lo8) of the address of the descriptor, - ; the higher portion is common for all descriptors - ldi ZL, lo8(hid_report_descriptor); Load ZL with the least significant 8 bits of hid_report_descriptor + cpi reg_wValueH, 0x02 ; Compare high byte of wValue with value 2; + breq process_descriptor ; If high byte of wValue is 0x02 (Configuration Descriptor), jump to handle that + adiw r30, hid_descriptor - config_descriptor ; Change Z to point to hid_descriptor + cpi reg_wValueH, 0x21 ; Compare high byte of wValue with value 
0x21 (HID Class HID Descriptor) + ; The following code will also be reused for the device descriptor - both of these descriptors + ; contain the size in the first byte, and getting the size from there saves one instruction. This + ; trick cannot be applied to the Configuration Descriptor (which is actually a collection of + ; multiple descriptors) and the HID Report Descriptor (which has a completely different format). +process_single_descriptor: + lpm r16, Z ; Load r16 with the first byte of descriptor, which contains its length in bytes. + ; This instruction does not change any flags in SREG, therefore it can be placed + ; between the compare and the corresponding conditional jump. + breq process_descriptor ; If the last compare result was equal, jump to return the descriptor data. + adiw r30, device_descriptor - hid_descriptor ; Change Z to point to device_descriptor + cpi reg_wValueH, 0x01 ; Compare high byte of wValue with value 1; + breq process_single_descriptor ; If high byte of wValue is 0x01 (Device Descriptor), jump to handle that; + ; reuse the code for hid_descriptor above. + adiw r30, hid_report_descriptor - device_descriptor ; Change Z to point to hid_report_descriptor ldi r16, 21 ; Load r16 with length of hid_report_descriptor (21 bytes) - - ; If needed, include other descriptors here + cpi reg_wValueH, 0x22 ; Compare high byte of wValue with value 0x22; + brne UNHANDLED_SETUP_REQUEST ; If high byte of wValue is NOT 0x22 (HID Class HID Report Descriptor), reject the setup request; + ; otherwise fallthrough to process_descriptor. 
process_descriptor: ; Acknowledge the SETUP packet - rcall clear_RXSTPI ; This function uses r17 to clear the RXSTPI bit in UEINTX + ldi r17, ~(_BV(RXSTPI)) ; Clear the Received SETUP Interrupt Flag (RXSTPI) in r17 + std Y+oUEINTX, r17 ; Store r17 to the USB Endpoint Interrupt Register (UEINTX) verifyMaxDescriptorLength: cp reg_wLengthL, r16 ; Compare the value in r24 (wLengthL) against the value in r16 (length of descriptor to send) @@ -892,54 +1013,23 @@ transfer_descriptor: send_packet_done: ; Clear Transmitter Ready Flag - rcall clear_TXINI ; This function uses r17 to clear the TXINI bit in UEINTX + ldi r17, ~(_BV(TXINI)) ; Clear the Transmitter Ready Interrupt Flag (TXINI) in r17 + std Y+oUEINTX, r17 ; Store r17 to the USB Endpoint Interrupt Register (UEINTX) ; Wait for the host to send an OUT packet (RXOUTI to assert), but abort if a SETUP packet is received wait_finish_transfer: ldd r17, Y+oUEINTX ; Load r17 with the most current value in the USB Endpoint Interrupt Register (UEINTX); - sbrs r17, RXOUTI ; Skip the next instruction if the Received OUT Data Interrupt Flag (RXOUTI) is set (there's already an OUT packet from the host), go to acknowledge_rxouti - sbrc r17, RXSTPI ; Skip the next instruction if the Received SETUP Interrupt Flag (RXSTPI) is not set; no need to abort, we haven't received another SETUP packet, we can keep looping - rjmp acknowledge_rxouti ; Jump if either RXOUTI or RXSTPI are set - rjmp wait_finish_transfer ; Loop back to finish_transfer until either Received OUT Data Interrupt Flag (RXOUTI) or Received SETUP Interrupt Flag (RXSTPI) is set - -acknowledge_rxouti: - - ; We could have gotten here if we got out of the previous loop (wait_finish_transfer) if either RXOUTI or RXSTPI asserted, since RXSTPI has the HIGHEST priority, - ; we check for it here first, to decide whether or not we need to abort - - ; Abort if RXSTPI is set - ; NOTE: R17 already has the most current value of UEINTX, no need to load it again sbrc r17, RXSTPI ; Skip 
the next instruction if the Received SETUP Interrupt Flag (RXSTPI) is cleared reti ; Return if RXSTPI is set, we need to prioritize SETUP packets + sbrs r17, RXOUTI ; Skip the next instruction if the Received OUT Data Interrupt Flag (RXOUTI) is set (there's already an OUT packet from the host) + rjmp wait_finish_transfer ; Loop back to finish_transfer if none of RXSTPI or RXOUTI flags are set ; Acknowledge the OUT packet - rcall clear_RXOUTI ; This function uses r17 to clear the RXOUTI bit in UEINTX - -UNHANDLED_SETUP_REQUEST: - - ; if (Endpoint_IsSETUPReceived()) - ; (UEINTX & (1 << RXSTPI)) - ldd r16, Y+oUEINTX ; Load r16 with the value in the USB Endpoint Interrupt Register (UEINTX); - sbrs r16, RXSTPI ; Skip the next instruction if the Received SETUP Interrupt Flag (RXSTPI) is set; received SETUP packet? - reti ; Return if RXSTPI is not set, SETUP packet already handled - - ; If we reach this part, the SETUP packet has not been handled, so we need to acknowledge it and request a stall + ldi r17, ~(_BV(RXOUTI)) ; Clear the Received OUT Data Interrupt Flag (RXOUTI) in r17 - ; Acknowledge the SETUP packet - rcall clear_RXSTPI ; This function uses r17 to clear the RXSTPI bit in UEINTX - - ; STALL transaction - - ; // Endpoint_StallTransaction(); - ; UECONX |= (1 << STALLRQ); - ldd r16, Y+oUECONX ; Load r16 with the value in the USB Endpoint Configuration Register (UECONX) - ori r16, _BV(STALLRQ) ; Set the STALL Request Handshake Bit (STALLRQ) in r16 - std Y+oUECONX, r16 ; Store r16 to the USB Endpoint Configuration Register (UECONX) - - -EP_ISR_END: - - reti ; Return from interrupt +clear_UEINTX_bit_and_reti: + std Y+oUEINTX, r17 ; Store r17 to the USB Endpoint Interrupt Register (UEINTX) + rjmp UNHANDLED_SETUP_REQUEST ; Go to UNHANDLED_SETUP_REQUEST ; ================================================================= @@ -948,14 +1038,20 @@ EP_ISR_END: set_watchdog_timer: - ; IMPORTANT!! 
This function assumes the correct values for the WDTCSR register - ; configuration are already loaded onto r16 and 17. + ; IMPORTANT!! This function assumes the correct value for the WDTCSR register + ; configuration is already loaded onto r17; it also modifies r16. + + ; always set YH to hi(EIO_BASE) before calling + ldi YL, lo8(EIO_BASE) ; Load YL with EIO_BASE (wdt_init) wdr ; Reset the Watchdog Timer + ldi r16, _BV(WDCE) | _BV(WDE) ; Load r16 with the value needed to "unlock" the Watchdog Timer Configuration + ; Write a logic one to the Watchdog Change Enable bit (WDCE) and Watchdog System Reset Enable (WDE) + std Y+oWDTCSR, r16 ; Store r16 to the Watchdog Timer Control Register (WDTCSR) + ; Load the desired configuration to the Watchdog Timer Control Register (WDTCSR) - sts WDTCSR, r16 ; Store r16 to the Watchdog Timer Control Register (WDTCSR) - sts WDTCSR, r17 ; Store r17 to the Watchdog Timer Control Register (WDTCSR) + std Y+oWDTCSR, r17 ; Store r17 to the Watchdog Timer Control Register (WDTCSR) ret @@ -967,20 +1063,14 @@ process_Host2Device: ; NOTE: All the functions here affect r17 - ; Acknowledge the SETUP packet - rcall clear_RXSTPI ; This function uses r17 to clear the RXSTPI bit in UEINTX - - ; Wait for TXINI (OK to transmit) - rcall wait_TXINI ; This function loads r17 with value of UEINTX - - ; Clear Transmitter Ready Flag - rcall clear_TXINI ; This function uses r17 to clear the TXINI bit in UEINTX - - ; SIZE OPTIMIZATION: Fall through to wait_TXINI instead of rcall'ing it - ; Wait for TXINI (OK to transmit) - ; rcall wait_TXINI ; This function loads r17 with value of UEINTX - ; ret ; Return from call + ; Acknowledge the SETUP packet and wait for TXINI (OK to transmit) + ldi r17, ~(_BV(RXSTPI)) ; Clear the Received SETUP Interrupt Flag (RXSTPI) in r17 + rcall clear_bit_and_wait_TXINI ; This function loads r17 with value of UEINTX + ; Clear Transmitter Ready Flag and wait for TXINI (OK to transmit) + ldi r17, ~(_BV(TXINI)) ; Clear the Transmitter 
Ready Interrupt Flag (TXINI) in r17 +clear_bit_and_wait_TXINI: + std Y+oUEINTX, r17 ; Store r17 to the USB Endpoint Interrupt Register (UEINTX) wait_TXINI: ; NOTE: This function uses r17, we can use this fact to code other stuff @@ -994,39 +1084,8 @@ wait_TXINI: ret ; Return from call -clear_RXSTPI: - - ; NOTE: This function affects r17 - - ; Acknowledge the SETUP packet - ldi r17, ~(_BV(RXSTPI)) ; Clear the Received SETUP Interrupt Flag (RXSTPI) in r17 - std Y+oUEINTX, r17 ; Store r17 to the USB Endpoint Interrupt Register (UEINTX) - - ret ; Return from call - - -clear_TXINI: - - ; NOTE: This function affects r17 - - ; Clear Transmitter Ready Flag - ldi r17, ~(_BV(TXINI)) ; Clear the Transmitter Ready Interrupt Flag (TXINI) in r17 - std Y+oUEINTX, r17 ; Store r17 to the USB Endpoint Interrupt Register (UEINTX) - - ret ; Return from call - - -clear_RXOUTI: - - ; NOTE: This function affects r17 - - ; Acknowledge the OUT packet - ldi r17, ~(_BV(RXOUTI)) ; Clear the Received OUT Data Interrupt Flag (RXOUTI) in r17 +clear_bit_and_wait_RXOUTI: std Y+oUEINTX, r17 ; Store r17 to the USB Endpoint Interrupt Register (UEINTX) - - ret ; Return from call - - wait_RXOUTI: ; NOTE: This function uses r17, we can use this fact to code other stuff @@ -1044,7 +1103,12 @@ do_SPM: ; NOTE: This function assumes r17 already has the correct value for the SPMCSR register, depending on the ; desired SPM operation + ; NOTE: If bit 7 of reg_bRequest is set to 1, the actual SPM instruction will not be executed + ; (the wait loop will still run, but should just complete immediately). out _SFR_IO_ADDR(SPMCSR), r17 ; store value in r17 to the Store Program Memory Control and Status Register (SPMCSR) + sbrs reg_bRequest, 7 ; Skip the actual flash operation if the "disable flash write" bit is set. + ; This is apparently safe, because the SPM instruction must be executed within 4 cycles after setting SPMEN, + ; and the sbrs instruction takes just 1 cycle when not skipping. 
spm ; execute spm instruction based on the value loaded to SPMCSR wait_SPM: