Israelsmmx/PandasReferenceSamples

Pandas Reference & common functions

@author Israel Sanchez @Last update 06/24/2023 @tested on Docker containers running on Windows

Prerequisites (Windows)

  • Docker Desktop
  • WSL 2 installed
  • Docker Hub account
  • Internet access
  • Energy to start :) lol

Setup with Docker containers

Jupyter with Python

 docker run -p 8888:8888 --name jupyter -v "${PWD}":/home/jovyan/work jupyter/datascience-notebook 

SQL Server

 docker run --name mssql2022 -v "${PWD}":/tmp -e "ACCEPT_EULA=Y" -e "MSSQL_SA_PASSWORD=M1dNigt3ss" -p 1433:1433 -d mcr.microsoft.com/mssql/server:2022-latest 

SQL Server SA user password: M1dNigt3ss (remember this, it is the value passed as MSSQL_SA_PASSWORD above)

Setup for SQL Server after startup

  • Download the AdventureWorks .bak file from the Microsoft repo

    <https://learn.microsoft.com/en-us/sql/samples/adventureworks-install-configure?view=sql-server-ver16&tabs=ssms>

  • Check the container ID with

     docker ps 

    krump@KrumbaRumba:~/code/$ docker ps
    CONTAINER ID   IMAGE                                        COMMAND                  CREATED        STATUS                    PORTS                    NAMES
    9fba1dd731fd   jupyter/datascience-notebook                 "tini -g -- start-no…"   18 hours ago   Up 23 minutes (healthy)   0.0.0.0:8888->8888/tcp   jupyter
    17c83a1327c9   mcr.microsoft.com/mssql/server:2022-latest   "/opt/mssql/bin/perm…"   18 hours ago   Up 23 minutes             0.0.0.0:1433->1433/tcp   mssql2022

  • Copy the AdventureWorks .bak file to the container

     docker cp AdventureWorksDW2022.bak 17c83a1327c9:/tmp 

  • Use a SQL Server client such as SQL Server Management Studio (SSMS) or Azure Data Studio to restore AdventureWorksDW2022 from the .bak file located at /tmp/AdventureWorksDW2022.bak

Setup for the Jupyter container: install the SQL Server ODBC libraries

 sudo docker exec -it --user root jupyter bash 

Inside the Jupyter container, as root, run the following commands. (It looks like too many lines, and I'm not an expert, but this installs ODBC Client 18 and syncs the dependencies locally.)

 apt update -y
 apt upgrade -y
 apt install gpg -y
 #Download the desired package(s)
 curl -O https://download.microsoft.com/download/1/f/f/1fffb537-26ab-4947-a46a-7a45c27f6f77/msodbcsql18_18.2.2.1-1_amd64.apk
 curl -O https://download.microsoft.com/download/1/f/f/1fffb537-26ab-4947-a46a-7a45c27f6f77/mssql-tools18_18.2.1.1-1_amd64.apk
 #(Optional) Verify signature, if 'gpg' is missing install it using 'apk add gnupg':
 curl -O https://download.microsoft.com/download/1/f/f/1fffb537-26ab-4947-a46a-7a45c27f6f77/msodbcsql18_18.2.2.1-1_amd64.sig
 curl -O https://download.microsoft.com/download/1/f/f/1fffb537-26ab-4947-a46a-7a45c27f6f77/mssql-tools18_18.2.1.1-1_amd64.sig
 curl https://packages.microsoft.com/keys/microsoft.asc | gpg --import -
 apt update -y
 gpg --verify msodbcsql18_18.2.2.1-1_amd64.sig msodbcsql18_18.2.2.1-1_amd64.apk
 gpg --verify mssql-tools18_18.2.1.1-1_amd64.sig mssql-tools18_18.2.1.1-1_amd64.apk
 apt update -y
 apt upgrade -y
 apt-get install apt-file -y
 apt-file update
 apt upgrade -y
 apt-file find msodbcsql18_18.2.2.1-1_amd64.apk
 apt-file find mssql-tools18_18.2.1.1-1_amd64.apk
 #Install the package(s)
 apt install -y unixodbc-dev -y
 apt install -y unixodbc -y
 apt update -y
 apt upgrade -y
 apt install gpg -y
 sudo apt install --reinstall software-properties-common -y
 apt-get install odbcinst -y
 sudo curl https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -
 curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
 echo "deb [arch=amd64] https://packages.microsoft.com/ubuntu/21.10/prod impish main" | sudo tee /etc/apt/sources.list.d/mssql-release.list
 sudo add-apt-repository "$(wget -qO- https://packages.microsoft.com/config/ubuntu/22.04/prod.list)"
 apt update -y
 sudo ACCEPT_EULA=Y apt-get install -y msodbcsql18
 apt install msodbcsql18 -y

Common libraries

Azure (future samples)

 pip install azure-core
 pip install azure-mgmt-compute
 pip install azure-mgmt-containerservice
 pip install azure-mgmt-containerinstance
 pip install azure-storage-blob
 pip install azure-keyvault
 pip install azure-mgmt-storage
 pip install azure.storage.blob
 pip install azure-storage-file-share
 pip install azure-storage-common
 pip install azure-mgmt-datalake-store
 pip install azure-mgmt-databricks
 pip install azure-data-tables
 pip install azure-mgmt-synapse
 pip install azure-mgmt-monitor

Misc

 pip install imageio
 pip install prophet
 pip install xgboost
 pip install matplotlib
 pip install seaborn
 pip install scikit-learn

SQL / Hana / pandas

 pip install sqlalchemy
 pip install pyodbc
 pip install pandas
 pip install pyarrow
 pip install sqlalchemy-hana
 pip install hdbcli
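With the ODBC driver and the packages above installed, connecting pandas to the SQL Server container could look roughly like the sketch below. It is not taken from the notebooks. The host name `host.docker.internal` (how Docker Desktop usually exposes the host's published ports to other containers), the `TrustServerCertificate=yes` option, the helper names `build_mssql_url` and `read_table`, and the sample table `dbo.DimCustomer` are all assumptions to adapt to your setup.

```python
from urllib.parse import quote_plus


def build_mssql_url(server, database, user, password,
                    driver="ODBC Driver 18 for SQL Server"):
    """Build a SQLAlchemy URL that passes a raw ODBC string to pyodbc."""
    odbc = (
        f"DRIVER={{{driver}}};SERVER={server};DATABASE={database};"
        f"UID={user};PWD={password};"
        # Assumption: the container uses a self-signed certificate, so the
        # client is told to trust it instead of failing the TLS handshake.
        "TrustServerCertificate=yes"
    )
    return "mssql+pyodbc:///?odbc_connect=" + quote_plus(odbc)


def read_table(url, query):
    """Run a query and return the result as a pandas DataFrame."""
    # Imported here so that building the URL works even before
    # pandas and sqlalchemy are installed.
    import pandas as pd
    from sqlalchemy import create_engine, text

    engine = create_engine(url)
    with engine.connect() as conn:
        return pd.read_sql(text(query), conn)


# Assumption: port 1433 is published by the mssql2022 container and reachable
# from the Jupyter container through host.docker.internal.
url = build_mssql_url(
    server="host.docker.internal,1433",
    database="AdventureWorksDW2022",
    user="sa",
    password="M1dNigt3ss",
)

# Uncomment once the database has been restored (table name is an assumption):
# df = read_table(url, "SELECT TOP 5 * FROM dbo.DimCustomer")
# print(df.head())
```

Keeping the URL construction separate from the query helper makes it easy to print and inspect the connection string when the driver name or host is wrong.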

Spark (future samples)

 pip install pyspark
 pip install ipython-sql
 pip install sparksql-magic
 pip install delta-spark


Initial libraries as extensions

  • Cargar_datos_excel_pandas.ipynb

Sample files for most common functions with Pandas

  • 000 datos_faltantes.ipynb

  • 000 filtrar_datos_dataframes.ipynb

  • 000 series_de_pandas.ipynb

  • 0001 dataframes_de_pandas.ipynb

  • 001 tutorial-limpieza-de-datos.ipynb

  • 002 tutorial-analisis-exploratorio-de-datos.ipynb

  • 0_FunctionIncludes.ipynb

  • 500 Ejercicio Distribution info SAP TABLES.ipynb

  • 500 Ejercicio Optimize DataFrameSize.ipynb

  • 500 Sample load from CSV and optimization of memory.ipynb

    Optimizes memory use when loading data

  • 501 Conectar con SQL Server.ipynb

  • 502 Pandas DataFrame JOINS.ipynb

    Joins DataFrames using the merge functionality from pandas

  • 503 frequently algoritms used on pandas.ipynb

  • 600 SQL Server Adventure Works SAMPLES.ipynb

  • 602 SQL SERVER ADVENTURE ALL Process.ipynb

    Reads tables from SQL Server and applies filters and merges

  • 991_Extraer datos de SAPHana.ipynb
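The three ideas described in the list above (memory optimization, filtering, and joining with merge) can be sketched together on a tiny example. This is a minimal illustration, not code from the notebooks: the `customers` and `orders` frames are made-up sample data, and the helper `optimize` is a hypothetical name.

```python
import pandas as pd

# Made-up sample data standing in for tables read from CSV or SQL Server.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "country": ["MX", "US", "MX", "CA"],
})
orders = pd.DataFrame({
    "order_id": [10, 11, 12, 13, 14],
    "customer_id": [1, 1, 2, 3, 5],
    "amount": [100.0, 250.5, 75.25, 300.0, 20.0],
})


def optimize(df, category_cols=()):
    """Return a copy with numeric columns downcast and chosen columns as category."""
    out = df.copy()
    for col in out.columns:
        if col in category_cols:
            # Repeated labels are stored once and referenced by small codes.
            out[col] = out[col].astype("category")
        elif pd.api.types.is_integer_dtype(out[col]):
            out[col] = pd.to_numeric(out[col], downcast="integer")
        elif pd.api.types.is_float_dtype(out[col]):
            out[col] = pd.to_numeric(out[col], downcast="float")
    return out


small_customers = optimize(customers, category_cols=("country",))
small_orders = optimize(orders)

# Compare memory before and after (savings only show up on larger frames).
print(orders.memory_usage(deep=True).sum(),
      small_orders.memory_usage(deep=True).sum())

# Filter: keep only orders above a threshold.
big_orders = small_orders[small_orders["amount"] > 50]

# Merge: a left join keeps every filtered order and adds the customer's country.
joined = big_orders.merge(small_customers, on="customer_id", how="left")
print(joined)
```

A left join is used here so that orders without a matching customer would still appear (with a missing country). Switch `how` to `"inner"` to drop them instead.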



About

Pandas reference samples for common functionality, including connecting to common databases
