This is a data collection and pre-processing exercise. The project is based on two synthetic data files. Using NumPy and pandas, we load, clean, and aggregate these datasets.
Group: Group1
GitHub Link: https://github.com/chence/DataCollectionPreProcessing.git
Team Members:
- Ce Chen | 9007166
- Zhuoran Zhang | 9048508

To set up the environment and render the notebook:

    $ python -m venv .venv
    $ source .venv/bin/activate
    (.venv)$ pip install -r requirements.txt
    (.venv)$ quarto render lab2.ipynb --to pdf --execute

Because the dataset at the provided download link did not contain a customer_id field, the data used in this project was synthesized using ChatGPT.
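
The load, clean, and aggregate workflow mentioned in the introduction can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: only `customer_id` comes from the text above, while the inline CSV contents, the other column names (`name`, `order_id`, `amount`), and the helper functions `load`, `clean_orders`, and `total_per_customer` are hypothetical stand-ins for the two synthetic files.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the two synthetic data files.
# Only customer_id is named in the project text; the rest is assumed.
CUSTOMERS_CSV = """customer_id,name
1,Alice
2,Bob
3,Carol
"""

ORDERS_CSV = """order_id,customer_id,amount
10,1,20.0
11,1,
12,2,15.5
12,2,15.5
13,4,9.0
"""


def load(text):
    """Load CSV text into a DataFrame (a file path would work the same way)."""
    return pd.read_csv(io.StringIO(text))


def clean_orders(orders, customers):
    """Drop exact duplicate rows, rows with a missing amount,
    and rows whose customer_id is not in the customers table."""
    orders = orders.drop_duplicates()
    orders = orders.dropna(subset=["amount"])
    known = orders["customer_id"].isin(customers["customer_id"])
    return orders[known].reset_index(drop=True)


def total_per_customer(orders):
    """Aggregate the order amount per customer."""
    return orders.groupby("customer_id")["amount"].sum()


customers = load(CUSTOMERS_CSV)
orders = clean_orders(load(ORDERS_CSV), customers)
totals = total_per_customer(orders)
print(totals)
```

In this sketch the cleaning rules (what counts as a duplicate, whether missing amounts are dropped or imputed) are illustrative choices; the real notebook may handle them differently.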