Take a look at the 63-bit PTP timer register inside the STM32F407 Ethernet MAC block. You can persuade it to generate pulses which are a little jittery (about 20 ns) but have an average period settable to a fraction of one nanosecond.
It implements a fractional-N division of, e.g., a 52 MHz update clock.
It is possible to trick it into generating interrupts/pulses at fairly high rates, certainly up to 10 kHz, if you keep updating the count compare register with the next count value every time it pulses.
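As a rough sketch of the "keep updating the compare register" step, the following shows how the next compare value could be computed so that the average period carries a fraction of a nanosecond. Everything here is hypothetical naming (ptp_time_t, pulse_sched_t, pulse_sched_advance), and it assumes the timer is handled as seconds plus nanoseconds; writing the result to the actual STM32 compare register is left out.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_SEC 1000000000u

/* Hypothetical time value: whole seconds plus nanoseconds (0..999999999). */
typedef struct {
    uint32_t sec;
    uint32_t nsec;
} ptp_time_t;

/* Pulse period = period_ns + frac_num/frac_den nanoseconds. */
typedef struct {
    ptp_time_t next;      /* time of the next pulse                 */
    uint32_t   period_ns; /* whole-nanosecond part of the period    */
    uint32_t   frac_num;  /* fractional part, numerator             */
    uint32_t   frac_den;  /* fractional part, denominator           */
    uint32_t   frac_acc;  /* accumulated fraction, in 1/frac_den ns */
} pulse_sched_t;

/* Call once per pulse: returns the compare value for the following pulse. */
ptp_time_t pulse_sched_advance(pulse_sched_t *s)
{
    assert(s->frac_den > 0 && s->frac_num < s->frac_den);

    uint64_t acc  = (uint64_t)s->frac_acc + s->frac_num;
    uint64_t nsec = (uint64_t)s->next.nsec + s->period_ns;

    /* When the accumulated fraction reaches a whole nanosecond, spend it. */
    if (acc >= s->frac_den) {
        acc  -= s->frac_den;
        nsec += 1;
    }
    s->frac_acc = (uint32_t)acc;

    /* Carry nanoseconds into seconds. */
    while (nsec >= NS_PER_SEC) {
        nsec -= NS_PER_SEC;
        s->next.sec += 1;
    }
    s->next.nsec = (uint32_t)nsec;
    return s->next;
}
```

In an interrupt handler you would call pulse_sched_advance() on each pulse and load the returned time into the compare register.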
I was able to use it to generate a precision 1PPS in a simple NTP server, where I software phase locked it to a 1PPS from a uBlox GPS receiver, so that when GPS went down, it would count on accurately for a reasonable length of time.
Basically, you enable the Precision Time Protocol in the MAC block. You can carry on using the MAC for Ethernet at the same time; it simply sends some extra IEEE-1588 bytes every time it sends a frame over the wire.
I believe that the MAC block in the STM32F407 is an off-the-shelf Synopsys IP core for the ARM ecosystem, so other ARM-based microcontrollers with Ethernet and IEEE-1588 may be able to do the same trick.
ALTERNATIVE: Directly use the fractional-N concept in programming the period of the divider. In simple terms, if you want to program an integer divider to divide by, say, 12.234, you tell it to divide by 12 for 1000 - 234 = 766 cycles, then tell it to divide by 13 for 234 cycles. Over the 1000 cycles this averages out to the correct ratio, but of course there will be a strong frequency modulation.
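A minimal sketch of that block scheme, just to make the arithmetic concrete. The function name block_divider and its parameters are made up for illustration; the returned value is what you would load into the timer's reload register for output cycle k.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Divisor for output cycle k when dividing by n + frac_num/frac_den:
 * the first (frac_den - frac_num) cycles of every block use n,
 * the remaining frac_num cycles use n + 1.
 */
uint32_t block_divider(uint32_t n, uint32_t frac_num, uint32_t frac_den,
                       uint32_t k)
{
    assert(frac_den > 0 && frac_num <= frac_den);

    uint32_t pos = k % frac_den;   /* position inside the current block */
    return (pos < frac_den - frac_num) ? n : n + 1;
}
```

For 12.234 that is block_divider(12, 234, 1000, k): 766 cycles of 12 followed by 234 cycles of 13, which is 12234 input clocks per 1000 output cycles.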
A better way is to calculate on each cycle whether the next pulse of the output signal should be closer to the N'th clock or the (N+1)'th clock after the previous pulse, and set the division ratio accordingly. Basically, you accumulate the time error in an accumulator, and when the error exceeds one cycle of the timer clock, you program an N+1 in the reload register and subtract one clock period from the time error accumulator.
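The accumulator idea could look something like the sketch below. The names (frac_div_t, frac_div_next) are invented for this example, and the error is kept as an integer count of 1/frac_den clock periods so that no floating point is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Divide by n + frac_num/frac_den using a time-error accumulator. */
typedef struct {
    uint32_t n;         /* integer part of the division ratio         */
    uint32_t frac_num;  /* fractional part, numerator                 */
    uint32_t frac_den;  /* fractional part, denominator               */
    uint32_t err;       /* accumulated error, in 1/frac_den of a clock */
} frac_div_t;

/* Returns the divisor (n or n + 1) to program for the next output cycle. */
uint32_t frac_div_next(frac_div_t *d)
{
    assert(d->frac_den > 0 && d->frac_den <= 0x80000000u);
    assert(d->frac_num < d->frac_den);

    d->err += d->frac_num;          /* each cycle falls short by the fraction */
    if (d->err >= d->frac_den) {    /* error has reached one whole clock      */
        d->err -= d->frac_den;      /* pay it back with one longer cycle      */
        return d->n + 1;
    }
    return d->n;
}
```

With n = 12, frac_num = 234, frac_den = 1000 the first four cycles divide by 12 and the fifth by 13, so the long cycles are spread through the block instead of bunched at the end.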
This also produces FM tones and squeaks. Noise shaping is possible by dithering: you add random values that average out to zero to the N/N+1 decision, which spreads the "tones" across the spectrum so they become lower-level broadband noise.
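One possible way to randomise the N/N+1 decision is sketched here. It is an assumption on my part rather than a tuned design: the names are hypothetical, the random source is a plain xorshift32 generator, and the dither is roughly uniform over plus or minus half a clock. Because the decision still feeds back into the error accumulator, the long-term average ratio is preserved.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t n;         /* integer part of the division ratio           */
    uint32_t frac_num;  /* fractional part, numerator                   */
    uint32_t frac_den;  /* fractional part, denominator                 */
    int32_t  err;       /* accumulated error, in 1/frac_den of a clock  */
    uint32_t rng;       /* xorshift32 state, must be non-zero           */
} dither_div_t;

/* Small xorshift32 pseudo-random generator (illustrative choice only). */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Returns n or n + 1, with a roughly zero-mean random offset on the decision. */
uint32_t dither_div_next(dither_div_t *d)
{
    assert(d->frac_den > 0 && d->frac_den <= 0x20000000u);
    assert(d->frac_num < d->frac_den && d->rng != 0);

    int32_t den    = (int32_t)d->frac_den;
    int32_t dither = (int32_t)(xorshift32(&d->rng) % d->frac_den) - den / 2;

    d->err += (int32_t)d->frac_num;
    if (d->err + dither >= den) {   /* threshold is jittered by the dither */
        d->err -= den;              /* but the real error is what is kept  */
        return d->n + 1;
    }
    return d->n;
}
```

The accumulated error stays bounded, so over many cycles the total count differs from the ideal by at most a clock or two while the regular pattern of long cycles is broken up.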
The next level up is to use a DAC, sampling at over 2x the 10 kHz maximum frequency, and directly generate the sine wave value you would get for each sample, the same as a DDS chip does.
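For illustration, a DDS-style generator boils down to a phase accumulator plus a sine lookup. The sketch below uses made-up names (dds_t, dds_tuning_word, dds_next_sample) and a deliberately tiny 16-entry, 8-bit table to keep it short; a real design would use a much larger table or interpolation, and the sample would go to whatever DAC register your part has.

```c
#include <assert.h>
#include <stdint.h>

/* One sine cycle, 16 entries, unsigned 8-bit, centred on 128. */
static const uint8_t sine16[16] = {
    128, 177, 218, 245, 255, 245, 218, 177,
    128,  79,  38,  11,   1,  11,  38,  79
};

typedef struct {
    uint32_t phase;   /* 32-bit phase, full scale = one output cycle */
    uint32_t tuning;  /* phase increment per DAC sample              */
} dds_t;

/* tuning = f_out / f_sample * 2^32, valid for f_out below f_sample / 2. */
uint32_t dds_tuning_word(uint32_t f_out_hz, uint32_t f_sample_hz)
{
    assert(f_sample_hz > 0 && f_out_hz < f_sample_hz / 2);
    return (uint32_t)(((uint64_t)f_out_hz << 32) / f_sample_hz);
}

/* Call once per sample tick; returns the value to write to the DAC. */
uint8_t dds_next_sample(dds_t *d)
{
    uint8_t sample = sine16[d->phase >> 28];  /* top 4 bits index the table */
    d->phase += d->tuning;                    /* wraps naturally at 2^32    */
    return sample;
}
```

The output frequency resolution is f_sample / 2^32, so at a 48 kHz sample rate the frequency can be set in steps of roughly 11 microhertz.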
At the extreme this turns into a 1-bit DAC clocked in the MHz region, where the spectrum of the signal around DC includes your wanted signal, but up in the above-audio region there is a lot of noise, which you filter out with a simple analog filter. Some microcontrollers have these building blocks on the chip, so you can get a "nice" 48 kHz sampled version of your up to 10 kHz sine wave.
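To give an idea of what a 1-bit DAC does, here is a sketch of a first-order sigma-delta modulator. This is a generic textbook structure, not the implementation of any particular chip, and the names sd1_t and sd1_next_bit are invented. It would be called at the high bit clock rate with the current signed 16-bit sample, and the returned bit driven onto a pin ahead of the analog filter.

```c
#include <assert.h>
#include <stdint.h>

/* First-order sigma-delta modulator state. */
typedef struct {
    int32_t integrator;  /* running sum of (input - fed-back output) */
} sd1_t;

/*
 * Produce the next output bit for a signed 16-bit input sample.
 * The density of 1s in the bit stream tracks the input level:
 * 0 gives 50 % ones, half of positive full scale gives 75 % ones.
 */
int sd1_next_bit(sd1_t *m, int16_t input)
{
    assert(m != 0);

    int bit = (m->integrator >= 0) ? 1 : 0;

    /* Feed the 1-bit decision back as plus or minus full scale. */
    int32_t feedback = bit ? 32768 : -32768;
    m->integrator += (int32_t)input - feedback;

    return bit;
}
```

The quantisation error ends up mostly at high frequencies, which is why a simple low-pass filter on the pin is enough to recover the wanted signal.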