I need to work out the architecture for a NASDAQ frontend charting application (a desktop app in .NET). Note that this is NOT for real-time quotes.
NASDAQ provides an API that gives historical pricing, limited to one year's data, which is fine for our purposes.
First I use that API to get the data (it comes as CSV files). I store those files in an S3 bucket.
Then I use AWS Glue (a cloud ETL tool) to move the data into a cloud DB (I originally had Redshift in mind; see EDIT below).
Meanwhile I set up a Lambda function that runs at the end of each day (say, at 01:00 AM) to get the price for each ticker for the day that just ended, and that function adds it to the historical data in the cloud DB (see EDIT below).
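To make the nightly step concrete, here is a minimal sketch of the append logic I have in mind. Everything in it is an assumption for illustration: the `prices` table, the `ticker,date,close` column layout, and the sample values are made up, and `sqlite3` only stands in for whatever cloud DB ends up being used. The one point it is meant to show is keying rows on (ticker, date) so that a retried run does not duplicate a day.

```python
import csv
import io
import sqlite3


def append_daily_prices(conn, csv_text):
    """Insert one day's rows; re-running the same day overwrites instead of duplicating."""
    rows = [
        (r["ticker"], r["date"], float(r["close"]))
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    conn.executemany(
        "INSERT OR REPLACE INTO prices (ticker, date, close) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


# Stand-in database (assumption): the real target would be the cloud DB.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prices ("
    "ticker TEXT, date TEXT, close REAL, PRIMARY KEY (ticker, date))"
)

# Made-up sample values, not real quotes.
sample = "ticker,date,close\nAAPL,2024-01-02,100.0\nMSFT,2024-01-02,200.0\n"
append_daily_prices(conn, sample)
append_daily_prices(conn, sample)  # simulate a retry of the same day
row_count = conn.execute("SELECT COUNT(*) FROM prices").fetchone()[0]
```

The `INSERT OR REPLACE` syntax is SQLite-specific; other databases have their own upsert form.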
Finally I create a separate Lambda function that the desktop app can call for any ticker symbol, and that function queries the database to return all the data up to yesterday (which is now in the DB).
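Roughly, I picture the query function looking like the sketch below. It assumes the API Gateway Lambda proxy integration (so the ticker arrives in `pathParameters` and the response is a dict with `statusCode` and a string `body`). The route shape, the optional `from` query parameter, the `prices` table, and the sample rows are all hypothetical, and `sqlite3` is again just a stand-in for the real database.

```python
import json
import sqlite3

# Stand-in database with made-up rows (assumption); the real function would
# connect to the cloud DB instead.
DB = sqlite3.connect(":memory:")
DB.execute("CREATE TABLE prices (ticker TEXT, date TEXT, close REAL)")
DB.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [
        ("AAPL", "2024-01-02", 100.0),
        ("AAPL", "2024-01-03", 101.0),
        ("MSFT", "2024-01-02", 200.0),
    ],
)


def handler(event, context):
    """Return stored prices for one ticker, optionally only from a given date on."""
    params = event.get("pathParameters") or {}
    ticker = (params.get("ticker") or "").upper()
    if not ticker:
        return {"statusCode": 400, "body": json.dumps({"error": "ticker is required"})}

    # Hypothetical optional filter: ?from=YYYY-MM-DD (ISO dates sort as text).
    query = event.get("queryStringParameters") or {}
    since = query.get("from", "")

    rows = DB.execute(
        "SELECT date, close FROM prices "
        "WHERE ticker = ? AND date >= ? ORDER BY date",
        (ticker, since),
    ).fetchall()
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps([{"date": d, "close": c} for d, c in rows]),
    }
```

The desktop app would then issue a plain GET for a ticker and parse the JSON array in the response body.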
Questions re this backend architecture
Is this a good use case for Lambda and the AWS API Gateway? And does the way I have laid it out make sense?
Or would it make more sense to use Python FastAPI on an EC2 instance to query the database and provide a GET endpoint for that? (Although I'd be concerned about how to scale that to support more clients.)
Questions re frontend app
A key question on the frontend: should the desktop app store the data locally for each ticker the user requests, and subsequently download only the days it does not have?
For example:
1. The user downloads Apple prices on Monday.
2. They close the app.
3. They open the app again on Thursday and ask to see Apple again.
Should the desktop app have stored the previous data locally, or should it just make a new request to get it all again?
The latter seems wasteful. Would it make more sense for the desktop app to check whether it has Apple data locally up to Monday, and then request only the Tue/Wed EOD prices?
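The "only request what is missing" idea boils down to a small date calculation. Here it is as a sketch in Python (the real client would be .NET, but the logic is the same). The function name `missing_range` is hypothetical, and it assumes the backend only has data up to yesterday, as described above.

```python
from datetime import date, timedelta


def missing_range(last_cached, today):
    """Return (start, end) of the dates to request, or None if the cache is current.

    start is None when nothing is cached, meaning "ask for the full history".
    """
    latest_available = today - timedelta(days=1)  # backend holds data up to yesterday
    if last_cached is None:
        return (None, latest_available)
    if last_cached >= latest_available:
        return None
    return (last_cached + timedelta(days=1), latest_available)


# Cached up to Monday 2024-01-01, app reopened on Thursday 2024-01-04:
to_fetch = missing_range(date(2024, 1, 1), date(2024, 1, 4))
# to_fetch == (date(2024, 1, 2), date(2024, 1, 3)), i.e. Tue and Wed only
```

This ignores weekends and market holidays; a requested range that covers non-trading days would simply come back with fewer rows.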
EDIT
Given the comment about the need to use Redshift, I did some further research and now I know that AWS Athena can query CSV files in S3 directly, using the AWS Glue Data Catalog for the table definitions.
So I guess Redshift is not required: the Lambda function that API Gateway calls can just run the query through Athena (I think, right?)
EDIT 2
Although I just learned that AWS Athena is not an option, as its queries are queued rather than run on demand, so it could take several minutes to respond.
So I'm still under the impression that we need to first store the data in an actual database for API Gateway to be able to query it quickly. Maybe Amazon Aurora?