26 events
Jun 6, 2017 at 6:53 comment added b3m2a1 @Szabolcs I know this question is quite old, but it turns out WDX can support access to certain explicit positions. I dredged this up when investigating how to work with data paclets. See this: mathematica.stackexchange.com/a/146139/38205 for a quick rundown of the layout. Unless, of course, you already knew this and there's a subtlety I'm missing. If so, please do let me know, because it's always good to learn these things.
May 23, 2017 at 12:35 history edited CommunityBot
replaced http://stackoverflow.com/ with https://stackoverflow.com/
Feb 3, 2016 at 21:45 comment added Athanassios Sure, I understand, and many thanks for sharing this with the rest of us. As I said, playing at the data model level is certainly easier, but you have to rely on the DBMS's data storage. This is how I have been researching solutions for data modeling. I believe Mathematica has to be enhanced with a data structure and in-memory processing similar to that of QlikView. That would boost its popularity and make it super efficient with large volumes of data. Then you only have to combine this with a DBMS of a similar type.
Feb 3, 2016 at 21:30 comment added Leonid Shifrin @Athanassios Well, I wasn't setting overly ambitious goals for this answer; I just tried to put together a minimal framework to address the basic needs specific to Mathematica workflows.
Feb 3, 2016 at 21:17 comment added Athanassios @LeonidShifrin Your answer and comments like the one from telefunkenvf14 suggest that the direction of research for such problems is that of database technology. We are in the era of NoSQL databases, and if you are not going to reinvent the wheel on the low-level I/O details of such a DBMS, then at a higher level you have to switch the way you normally think, i.e. from records (rows) to fields (columns) to values (cells). In data-modeling terms, the problem you face is that of redundancy. Single-instance storage and associative technology like that in QlikView are in the right direction.
Sep 3, 2015 at 21:39 history edited Leonid Shifrin
edited tags
S Dec 12, 2012 at 17:15 history bounty ended whuber
S Dec 12, 2012 at 17:15 history notice removed whuber
Dec 5, 2012 at 22:16 comment added Szabolcs @Chris Unfortunately that is very very slow with large data, and also produces huge files.
Dec 5, 2012 at 16:36 comment added Chris Degnen Rather than DumpSave["mydata.mx", mydata], try Save["mydata.sav", mydata]. I find it very useful.
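A minimal sketch of the two calls being compared in these comments, assuming a hypothetical symbol mydata holding a large numerical array; timings and file sizes will vary with the data and the Mathematica version:

(* Hypothetical data; both calls below write it to disk. *)
mydata = RandomReal[1, {10^6, 5}];

(* DumpSave writes a binary .mx file: fast and compact, but tied to the
   Mathematica version/platform that wrote it. *)
DumpSave["mydata.mx", mydata];

(* Save writes plain, human-readable Mathematica input: portable, but
   typically much slower and much larger for big numerical data. *)
Save["mydata.sav", mydata];

(* Either file is loaded back with Get. *)
Get["mydata.mx"];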
S Dec 5, 2012 at 16:28 history bounty started whuber
S Dec 5, 2012 at 16:28 history notice added whuber Reward existing answer
Dec 3, 2012 at 11:44 history edited Mechanical snail
edited tags
Jan 25, 2012 at 23:56 history tweeted twitter.com/#!/StackMma/status/162323088837591042
Jan 25, 2012 at 17:33 vote accept Szabolcs
Jan 18, 2012 at 20:36 answer added Leonid Shifrin timeline score: 116
Jan 18, 2012 at 4:33 history edited Mike Bailey
edited tags
Jan 18, 2012 at 1:22 comment added Mike Honeychurch @Szabolcs As an FYI, I had stored all my (economic/financial) data as WDX prior to switching it over to MySQL a couple of years ago. It has made updating and retrieving so much easier. For your problem, it seems to me that databases are designed for these sorts of tasks. Also see Sal Mangano's talk about kdb+ if extracting columns rather than rows is better for what you specifically want to do. The advantages appear to be speed enhancements of many orders of magnitude. I don't have links handy, but they should be easy to find.
Jan 18, 2012 at 1:18 comment added acl @MikeB that doesn't scale though, and is inconvenient. I generally do the same (having access to machines with 512GB of RAM helps), but I'd like to know how to do things in a more reasonable way.
Jan 18, 2012 at 0:14 comment added Szabolcs @MikeHoneychurch You're right, I only need chunks. I asked about such a large amount of data to get a good feel for the loading speed. Loading the data will hopefully not be the bottleneck. (Compare WDX loading speed to MX; there's a huge difference.)
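For reference, a minimal sketch of the kind of WDX-versus-MX comparison alluded to here, assuming a hypothetical array data; the actual timings depend heavily on the data and version:

data = RandomReal[1, {10^6, 10}];

(* WDX is a portable Wolfram exchange format, but reading it is comparatively slow. *)
Export["data.wdx", data];
First@AbsoluteTiming[Import["data.wdx"];]

(* MX is a binary, version/platform-specific dump that loads much faster. *)
DumpSave["data.mx", data];
First@AbsoluteTiming[Get["data.mx"];]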
Jan 17, 2012 at 23:31 comment added Mike Bailey This is one of the times that I've taken the brute force approach: Just throw more memory at the problem. On my last update I upgraded my machine to 12 GB of RAM. On some simulations I was running it was easily eating up in excess of 2 GB per kernel (= 8 GB total). It would be convenient to have some way of streaming data in and out of kernels though.
Jan 17, 2012 at 23:20 comment added Mike Honeychurch @Szabolcs I haven't worked with stuff that large, and I would imagine that you will run into Mma limitations. From your background and question I thought you only wanted to bring "chunks" of data into Mma on demand. In other words, do you really need the entire 2 GB, or can you do some SQL operations to pick out what you need?
Jan 17, 2012 at 23:12 comment added Szabolcs @Mike I have never done that; it's good to hear how well that works in practice. E.g. how long would it take to load 2 GB of data into Mathematica from a database, compared to MX files?
Jan 17, 2012 at 23:11 comment added Mike Honeychurch Personally I find life so much easier by having data in a database and linking to Mma.
Jan 17, 2012 at 22:57 comment added Simon Using databases in Mathematica was discussed in Using Mathematica in MySQL databases. I know that QLink is used in some fairly large Feynman diagram calculations...
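For completeness, a minimal DatabaseLink sketch of the database route suggested in these comments, using the in-memory HSQLDB driver bundled with DatabaseLink; the table name and columns are purely illustrative, and a real setup would point OpenSQLConnection at a MySQL (or other) server instead:

Needs["DatabaseLink`"]

(* Open an in-memory HSQLDB connection shipped with DatabaseLink. *)
conn = OpenSQLConnection[JDBC["HSQL(Memory)", "tempdb"]];

(* Create a small table and insert some rows. *)
SQLExecute[conn, "CREATE TABLE prices (id INTEGER, value DOUBLE)"];
SQLInsert[conn, "prices", {"id", "value"}, {{1, 1.5}, {2, 2.7}}];

(* Pull only the rows needed, instead of loading everything into the kernel. *)
SQLExecute[conn, "SELECT value FROM prices WHERE id = 2"]

CloseSQLConnection[conn];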
Jan 17, 2012 at 22:36 history asked Szabolcs CC BY-SA 3.0