
Timeline for June 2023 Data Dump is missing

Current License: CC BY-SA 4.0

81 events
when what by license comment
S Jun 22, 2023 at 12:06 history bounty ended CommunityBot
S Jun 22, 2023 at 12:06 history notice removed user152859
Jun 18, 2023 at 11:27 comment added user152859 @AaronBertrand good to see you're still around. But I'm really concerned, both about the data dumps (management can snap its fingers and shut them down again at any moment), and SE in general, as it's sinking now into the AI depths.
S Jun 18, 2023 at 11:22 history bounty started CommunityBot
S Jun 18, 2023 at 11:22 history notice added user152859 Reward existing answer
S Jun 18, 2023 at 10:43 history bounty ended Random Person
S Jun 18, 2023 at 10:43 history notice removed Random Person
S Jun 17, 2023 at 6:47 history bounty started Random Person
S Jun 17, 2023 at 6:47 history notice added Random Person Reward existing answer
Jun 17, 2023 at 3:20 history edited V2Blast Staff Mod
edited tags
Jun 17, 2023 at 2:21 comment added Aaron Bertrand Staff ^ Concur - I wouldn't have wanted to babysit this process without some of the great enhancements @PatrickHurst and Andy have made, especially earlier this year.
S Jun 16, 2023 at 17:55 history bounty ended Resistance Is Futile
S Jun 16, 2023 at 17:55 history notice removed Resistance Is Futile
Jun 16, 2023 at 0:26 comment added Patrick Hurst @ShadowWizardStrikesBack there were some bumps in the road, but as mentioned, there was a lot of good work done that made the process much more observable and resilient recently.
Jun 15, 2023 at 7:37 history edited Random Person CC BY-SA 4.0
Removed user id and added employment status
Jun 14, 2023 at 18:04 history edited NoDataDumpNoContribution CC BY-SA 4.0
linked to answers containing inside information from the company or an ex-employee of the company so visitors can navigate more easily later
Jun 14, 2023 at 17:56 comment added NoDataDumpNoContribution @DataDude Please change the acceptance mark to the post by Philippe. It's also the newest official post about the topic.
Jun 14, 2023 at 17:54 comment added NoDataDumpNoContribution @wimi Of course the right answer should be accepted. I will remind the OP. And maybe I should also edit the official posts to clearly mark them as official. I think the number of votes and comments show that the official answers aren't forgotten. Believe me, a single question is better if done right.
Jun 14, 2023 at 17:33 comment added wimi @NoDataDumpNoContribution now you got your way: the answer on the other post has been deleted and the accepted answer here says "the data dump is disabled", which is not true. The strike update also says nothing about the reenablement. I still think this is malicious misleading of the users, but whatever, it seems that SE is fine with it.
Jun 14, 2023 at 13:25 history post merged (destination)
Jun 14, 2023 at 13:18 comment added ꓢPArcheon @NoDataDumpNoContribution "How will anyone even later find out what really happened"... what if that was exactly the thing to prevent?
Jun 14, 2023 at 13:04 answer added ꓢPArcheon timeline score: 8
Jun 14, 2023 at 9:24 comment added NoDataDumpNoContribution @wimi "I would never have seen the announcement there." Why not? Many others seem to see it there and it would have been accepted. The question was "Where are the June data dumps?" and the accepted answer would say "Here they are?". Better than having two questions about it in order to see the accepted answer faster. How will anyone even later find out what really happened? We don't pin accepted answers anymore. This is maybe a case where one sees the drawbacks of this.
Jun 14, 2023 at 9:20 comment added starball Mod @wimi related: Show MSE posts authored by staff members on the homepage regardless of the vote count
Jun 14, 2023 at 8:30 comment added NoDataDumpNoContribution Regarding that part "Will the company continue to provide them going forward?" of the question, the only possible answer is that nobody knows. This part of the question is as useful as asking "Will AI take over the world?"
Jun 14, 2023 at 6:31 answer added NoDataDumpNoContribution timeline score: 15
Jun 14, 2023 at 4:15 comment added NoDataDumpNoContribution I don't understand the strategy of opening up or not opening up of new questions by the company. Sometimes they hijack their old questions and update them with new information, now they open up a new question with the same content as an old question. They should know how this Q&A system is supposed to work.
Jun 13, 2023 at 23:51 comment added Starship I suppose, but it's sure not planned anymore @AMtwo
Jun 13, 2023 at 23:48 comment added Thomas Markov @SonictheAnonymousHedgehog It’s good to see that was already done.
Jun 13, 2023 at 23:37 comment added Starship Should this be changed to status-completed?
Jun 13, 2023 at 23:11 answer added kaya3 timeline score: 47
Jun 13, 2023 at 23:07 answer added starball Mod timeline score: 27
Jun 13, 2023 at 23:04 comment added Sébastien Renauld Alternatively, you could rephrase this to be the actual statement, with additional info and prompt for questions, Q&A style. Thomas Owens and kaya3 had on-point concerns that probably could be expanded here, rather than a self-answer that is a mirror of another question.
Jun 13, 2023 at 22:32 answer added Philippe Staff Mod timeline score: 124
Jun 11, 2023 at 19:34 comment added anon @starball, the Site's TOS states: "From time to time, Stack Overflow may make available compilations of all the Subscriber Content on the public Network (the “Creative Commons Data Dump”). The Creative Commons Data Dump is licensed under the CC BY-SA license. By downloading the Creative Commons Data Dump, you agree to be bound by the terms of that license." So the Data Dump's enrichment (internal IDs, enums, etc) is also covered under CC BY-SA, in addition to the subscriber content.
Jun 11, 2023 at 19:28 comment added anon @starball the data dump itself is also licensed under CC BY-SA, in addition to the user content under that license.
Jun 11, 2023 at 15:25 comment added User-o-resU The strikers' organisation needs to download the March dump and fast, if they haven't already done so.
Jun 11, 2023 at 15:11 comment added User-o-resU @chx - Good comment. Yes, the end of Web 2.0. And if I might make a prediction, Google too will stop "providing" so much of their service for "free". (I won't be surprised if they stop letting people use their websearch engine except through an app.) Yes I know most of it is advertising, but still. Web 2.0 was akin to a bait and switch operation.
Jun 11, 2023 at 8:18 comment added starball Mod @Richard Again, SEDE has query timeouts. A data dump does not. And SEDE is hosted by Stack Exchange and Stack Exchange can take that down whenever they want. At this point, I wouldn't even be surprised if they did. But once you've downloaded a data dump, they can't take that away from you (what would they do? knock on your door?). Subscriber content is CC-BY-SA.
Jun 11, 2023 at 8:16 comment added Richard @starball - So use that
Jun 11, 2023 at 8:15 comment added starball Mod @Richard well, at this point, if you're not convinced, it's in the eye of the beholder. data.stackexchange.com/stackoverflow/queries
Jun 11, 2023 at 8:13 comment added Richard @starball - Sorry, but all I'm seeing is fringe cases and obscure apps that benefit a tiny number of people.
Jun 11, 2023 at 7:53 comment added user152859 @JourneymanGeek it's a first aid plaster to stop the bleeding. Nothing more. (i.e., give those who want to believe it the impression that SE actually plans to do anything. I don't buy it.)
Jun 11, 2023 at 5:38 comment added starball Mod @Richard "It's just more people saying how fab it is, without any examples of why it's fab and what they're using it for"... there's so much. Basically anything you could use SEDE for: data analysis and research, and all the various applications of that (e.g., disproving false claims about the platform from on high). SEDE has query timeouts. A data dump does not.
Jun 11, 2023 at 2:29 answer added Ajedi32 timeline score: 27
Jun 10, 2023 at 23:02 comment added chx As much as such things exist and can be dated, Jun 9, 2023 was the end of Web 2.0 as we know it with the Reddit CEO's AMA and this post. It's a pity. It was a good run, I guess.
Jun 10, 2023 at 16:21 history edited tripleee CC BY-SA 4.0
Typos
Jun 10, 2023 at 8:31 comment added Journeyman Geek Mod @Richard stackoverflow.blog/2022/10/20/… is one example
Jun 10, 2023 at 8:30 comment added Richard @Starball - That link doesn't help in the slightest. It's just more people saying how fab it is, without any examples of why it's fab and what they're using it for.
Jun 10, 2023 at 8:28 comment added Journeyman Geek Mod @Rosie what does the status planned mean here though? That the data dumps would be released in some form, or SE is digging in its heels and there will be serious restrictions to access by the community that has contributed all the content that's actually in the dump?
Jun 10, 2023 at 8:25 comment added starball Mod @Richard web.archive.org/web/20230203170609/https://stackoverflow.blog/…. Quoting Journeyman Geek, "our insurance policy should the company go evil". That plus the CC-BY-SA license
Jun 10, 2023 at 8:15 comment added Richard I've been using SE for more than a decade and this is the first I've heard about a data dump. Can you explain to me like I'm five what this is for and why normal users should care that your favourite toy just got taken away?
Jun 10, 2023 at 6:28 answer added NoDataDumpNoContribution timeline score: 67
Jun 10, 2023 at 1:29 answer added user1376343 timeline score: 1
Jun 10, 2023 at 0:52 comment added wizzwizz4 @DataDude Acceptance, here, would be for visibility. It's important for people to be able to see the official answer. (Answer quality is what votes are for.)
Jun 10, 2023 at 0:29 comment added Data Dude @wizzwizz4 -- The official answer is less complete than the most upvoted answer. I know "official company answers" are usually considered the "most correct" but in this case, I think AMtwo wrote the more informative answer. So I've accepted his.
Jun 10, 2023 at 0:27 vote accept Data Dude
Jun 9, 2023 at 21:23 answer added wizzwizz4 timeline score: 93
Jun 9, 2023 at 20:29 answer added Script47 timeline score: 48
Jun 9, 2023 at 18:30 answer added curious timeline score: 157
Jun 9, 2023 at 18:27 comment added Rosie Staff Mod Jody has responded in an answer below. I've updated the status to planned.
Jun 9, 2023 at 18:24 history edited Rosie Staff Mod
edited tags
Jun 9, 2023 at 18:23 answer added Jody Bailey Staff timeline score: -105
Jun 9, 2023 at 15:16 comment added anon @miku The closest you can come is a GDPR Data Deletion Request. Some more info & discussion here
S Jun 9, 2023 at 14:47 history bounty started Resistance Is Futile
S Jun 9, 2023 at 14:47 history notice added Resistance Is Futile Reward existing answer
Jun 9, 2023 at 14:39 comment added NoDataDumpNoContribution @miku You cannot do that. Your contributions aren't yours anymore. When you posted, you basically gave the content to the company (and everyone) and this is not retractable. You have a right to be dissociated from the content though. And of course you can stop contributing in the future.
Jun 9, 2023 at 14:37 comment added Bryan Krause @miku You don't have an option to delete your contributions, you've already licensed them out (and the terms don't give you any option for revocation). Future contributions you can decide on, though.
Jun 9, 2023 at 14:23 comment added miku Interestingly, the public availability of the data was a main reason I chose to contribute to Stack Exchange in the first place. If that's not just a delay, I might request deletion of all my contributions.
Jun 9, 2023 at 13:01 answer added anon timeline score: 281
Jun 9, 2023 at 10:49 comment added Gloweye An uncharitable observer might conclude the data contradicts the company's narrative. I'm not feeling particularly charitable right now.
Jun 8, 2023 at 20:42 comment added anon Aaron just took the metaphorical slaps for me, by handling the public comms. It was the other redundant DBRE who got the process running smoothly, and me pointing and crying when I had to manually upload files after it got hung.
Jun 8, 2023 at 18:20 comment added user152859 @AMtwo hello! What about Aaron? In all recent cases he was the one to save the day. Well, the dump. :D
Jun 8, 2023 at 15:40 comment added anon @RoryAlsop the data dump is just a full export of SEDE. You're limited by the tooling for timeouts, and doing the analysis in SQL Server, but there's at least some alternative
Jun 8, 2023 at 15:34 comment added anon I'm hoping someone from the Company gives an official response soon. Having been recently laid off, I'm not in a position to give a complete/official answer here... But if there's no official response soon, I'll jump in with what I can offer.
Jun 8, 2023 at 15:31 comment added anon @ShadowWizardStrikesBack I can confirm it has been frequently late since Taryn left. 🤣 The upload to the Archive would frequently hang and need babysitting. That said, we actually resolved that recently, so the data dump wouldn't be late anymore.
Jun 8, 2023 at 14:01 comment added Taryn @ShadowWizardStrikesBack rude...I disagree that it was never done on time. :)
Jun 8, 2023 at 7:08 comment added user152859 I don't think it was ever done on time. It's a very complex process, hence always failing at some point. I consider it a miracle when it's actually working and there is actually a data dump. (Kind of like launching a spaceship.)
Jun 8, 2023 at 5:57 history edited Tinkeringbell Mod
edited tags
Jun 7, 2023 at 21:29 comment added Rory Alsop If the data was published it might make it easier to debunk the nonsense...
Jun 7, 2023 at 14:26 history asked Data Dude CC BY-SA 4.0