12

I have a really large table need to calculate (10 billion + rows). It's too long to calculate all of them at one time. Therefore, I create a list of break points, and I calculate and store at the end of each step. However, each step costs longer time. I think the reason is the memory or cache, do you have any idea how to deal with in this situation or just how to clear cache or memory?

I reuse the variable for output in the loop, so the variable will not become bigger and bigger.

Thank you so much.

7
  • Where do you store them? In a file or database? Commented Apr 17, 2017 at 21:19
  • 1
    Cache is irrelevant, no software outside maybe the OS directly messes with that (cache is hardware-driven, not software). Commented Apr 17, 2017 at 21:20
  • How does the mentioned large table implemented? Is it a database table or a dictionary? Commented Apr 17, 2017 at 21:20
  • store into SQL server. @AlexHall Commented Apr 17, 2017 at 21:22
  • It's retrieved from SQL server, and transformed to pandas.DataFrame, And then calculated and stored into SQL server. @ImrePiller Commented Apr 17, 2017 at 21:24

1 Answer 1

8

I use to solve this issue using a line that reset the variable at the end of the process, cleaning the cache:

MyVariable = None 
Sign up to request clarification or add additional context in comments.

1 Comment

How is that different from del MyVariable in terms of cleaning the cache?

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.