That's the usual way to do an indirect JSR on a 6502. The 6502 does not support indirect subroutine calls (*1), so they have to be done in software. Indirect subroutine calls are a useful tool for calls into OS/library functions which may change at runtime or by configuration, like when redirecting output to a different driver. By using a routine pointer for such calls it's easy to overload/replace them by just changing that pointer (*2).
Lacking an indirect call instruction, the 6502 emulates one by calling a subroutine which in turn pushes the pointer onto the stack (high byte first) and then jumps there by 'returning' to it. This adds some cycles, but preserves the flexibility (*3).
In detail it works like this:
    LDA ptr+1   ; high byte of target routine pointer
    PHA         ; push onto the stack
    LDA ptr     ; low byte of target routine pointer
    PHA         ; push onto the stack
    RTS         ; 'returning' to the address at TOS

The TAX/TXA around it is just to preserve the parameter (character to be printed) across the stack handling code.
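To put the fragment in context, here is a minimal sketch of a complete call, assuming ca65-style syntax. The labels `print_a`, `call_ptr`, `handler` and `ptr` are made up for illustration and are not taken from the Atari OS.

```asm
; Sketch only: print_a, call_ptr, handler and ptr are hypothetical labels.

print_a:
        LDA #'A'        ; parameter for the target routine
        JSR call_ptr    ; pushes the address the target will return to
        RTS             ; the target's own RTS resumes execution here

call_ptr:
        TAX             ; save the parameter, A is needed for the pushes
        LDA ptr+1       ; high byte first
        PHA
        LDA ptr         ; then low byte
        PHA
        TXA             ; restore the parameter
        RTS             ; pulls ptr, adds one, continues at handler

handler:
        ; ... real work, parameter in A ...
        RTS             ; returns to the caller of call_ptr

ptr:    .word handler-1 ; stored minus one because RTS increments
```

Note that the JSR is what makes this a call rather than a jump: the return address it pushes stays on the stack below the pushed pointer, so the target's final RTS goes back to the original caller.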
The NES-Dev Wiki offers a nice page about this topic.
On its own, the above routine is a waste of time (23 cycles) and code (9 bytes) compared to a JSR pointing to an indirect jump. But when the OS table is, as in this case, prepared for being executed this way, it holds the routine addresses minus one, so an indirect jump won't work.
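For comparison, the cheaper variant with a JSR pointing to an indirect jump could look like the sketch below. It only works if the table holds the real entry address, which is exactly what a minus-one table does not provide. `caller`, `call_vector`, `handler` and `vector` are hypothetical labels, ca65-style syntax assumed.

```asm
; Sketch only: hypothetical labels, not taken from any OS.

caller:
        JSR call_vector ; pushes the return address as usual
        RTS             ; handler's RTS comes back here

call_vector:
        JMP (vector)    ; 3 bytes, 5 cycles on an NMOS 6502

handler:
        ; ... real work ...
        RTS

vector: .word handler   ; the real entry address, not minus one
        ; On an NMOS 6502 the vector must not start at $xxFF:
        ; JMP (ind) then fetches the high byte from $xx00.
```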
Further, I'm not so sure that just grabbing the routine from IOCB 0 is a great idea. While it should work, as IOCB 0 is usually associated with the screen, it's definitely not fault tolerant. It might be much better to go through CIO instead.
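A rough sketch of what going through CIO might look like: issue a 'put characters' command with a buffer length of zero, in which case CIO outputs the character held in the accumulator. The equates below are the values usually given in Atari OS listings, but they are assumptions here and should be checked against the OS documentation; `putchar` is a hypothetical label.

```asm
; Sketch only, ca65-style syntax. Equates assumed from the usual
; Atari OS listings; verify before relying on them.
ICCOM   = $0342         ; IOCB command byte
ICBLL   = $0348         ; IOCB buffer length, low byte
ICBLH   = $0349         ; IOCB buffer length, high byte
CIOV    = $E456         ; CIO entry vector
PUTCHR  = $0B           ; 'put characters' command

putchar:                ; character to print in A
        PHA             ; save the character
        LDX #$00        ; IOCB number times 16, here IOCB 0
        LDA #PUTCHR
        STA ICCOM,X
        LDA #$00
        STA ICBLL,X     ; length 0: CIO takes the character from A
        STA ICBLH,X
        PLA             ; character back into A
        JSR CIOV        ; status comes back in Y
        RTS
```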
(Caveat: My Atari knowledge is only small and rather rusty)
*1 - One of the few really missing instructions that could have been added rather easily. And a major hint that the 6502 wasn't designed with a general purpose CPU in mind, but rather a microcontroller with its fixed address locations, where such redirection is done during compile/linkage time.
*2 - Always remember to store the target address minus one, as RTS increments the pulled address before fetching the next instruction.
*3 - Some OSes sped this up by putting a JMP opcode in front of every callable pointer, allowing a user program to simply JSR to the pointer-1 address, reducing the overhead to 3 cycles.
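The layout described in *3 can be sketched as follows: each pointer is preceded by a JMP absolute opcode ($4C), so the three bytes together form a complete instruction that a caller can JSR to directly. `caller`, `out_vec`, `out_ptr` and `handler` are hypothetical labels, not taken from any particular OS, and ca65-style syntax is assumed.

```asm
; Sketch only: hypothetical labels, not taken from any OS.

caller:
        JSR out_vec     ; i.e. JSR to the address of out_ptr minus one
        RTS             ; handler's RTS comes back here

out_vec:
        .byte $4C       ; opcode of JMP absolute
out_ptr:
        .word handler   ; real entry address, replaceable at runtime

handler:
        ; ... real work ...
        RTS
```

Redirecting the call then means storing a new address into `out_ptr` (which has to live in RAM for that); the only extra cost per call is the 3 cycles of the JMP.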