18 events
Jun 11, 2020 at 12:04 history edited CommunityBot
Commonmark migration
Dec 18, 2014 at 14:38 comment added MERose True. But fortunately, we don't have a King Richard Iii in our dataset. However, I really appreciate the work of you all!
Dec 18, 2014 at 14:09 comment added don_crissti @MERose - either way, you will never get it right via a script like this (unless you manually fix the wrong stuff afterwards). You'd need a dict to handle all particular cases. And even then... Here's another one: KING RICHARD III - it's really ugly when converted to King Richard Iii.
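The "dict to handle all particular cases" idea in this comment can be sketched as follows. This is only an illustration in Python, not the script from the answer: the `EXCEPTIONS` table, its entries, and both function names are hypothetical, and the base rule assumes ASCII letters.

```python
import re

# Hypothetical exception table: keys are upper-cased words, values are the
# spelling we want. A real dataset would need its own, much longer list.
EXCEPTIONS = {
    "III": "III",
    "MCCABE": "McCabe",
    "VANZANT": "VanZant",
}


def lower_after_letters(word):
    """Base rule (an assumption): lowercase capitals that follow a letter."""
    return re.sub(r"(?<=[A-Za-z])[A-Z]+", lambda m: m.group().lower(), word)


def title_with_exceptions(text, exceptions=EXCEPTIONS):
    """Apply the base rule word by word, unless the word is a listed exception."""
    def fix(match):
        word = match.group()
        return exceptions.get(word.upper(), lower_after_letters(word))

    # \S+ matches each whitespace-delimited word, so spacing is preserved.
    return re.sub(r"\S+", fix, text)


if __name__ == "__main__":
    for sample in ("KING RICHARD III", "MCCABE & MRS. MILLER", "PAIGE VANZANT"):
        print(title_with_exceptions(sample))
```

The lookup is on whole whitespace-delimited words, so a word with attached punctuation (for example `III,`) would miss the table and fall back to the base rule. That limitation is one reason such a list never feels complete.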
Dec 18, 2014 at 13:38 comment added mikeserv @MERose - well, that's good. maybe you can find a use for it...
Dec 18, 2014 at 13:35 comment added MERose Yes, that's what I meant in the above code. We need D'Artagnan and don't care if d'Artagnan is the only correct version. We just think that D'Artagnan is still better than D'artagnan or D'ARTAGNAN.
Dec 18, 2014 at 12:36 comment added mikeserv @MERose - letters following a - are not converted - remember this only does upper to lower case conversions - it doesn't go the other way. So if a letter following an apostrophe is already uppercase it will remain that way - as is also true for dashes. Both of these are represented in the example output. Otherwise though - the lowercase d in d'Artagnan - it gets much more difficult very quickly. I can do it... but what of O'Malley for example? It's very arbitrary.
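The behaviour described in this comment (only upper-to-lower conversion, with letters at the start of a word or right after an apostrophe or dash left alone) can be approximated as below. This is a rough Python sketch rather than the answer's sed script; the function name `lower_after_letters` is made up for illustration, and only ASCII letters are assumed.

```python
import re


def lower_after_letters(text):
    """Lowercase every run of capitals that directly follows a letter.

    Nothing is ever uppercased. A capital at the start of a word, or right
    after an apostrophe or a dash, has no letter before it and is kept.
    """
    return re.sub(r"(?<=[A-Za-z])[A-Z]+", lambda m: m.group().lower(), text)


if __name__ == "__main__":
    samples = [
        "D'ARTAGNAN",        # capital after the apostrophe is kept
        "d'Artagnan",        # already mixed case: left untouched
        "JEAN-PAUL",         # capital after the dash is kept
        "DON'T KNOW",        # the same rule leaves 'T uppercase
        "KING RICHARD III",  # numerals are treated like any other word
    ]
    for sample in samples:
        print(sample, "->", lower_after_letters(sample))
```

Because the rule never goes from lower to upper case, `d'Artagnan` passes through unchanged while `D'ARTAGNAN` becomes `D'Artagnan`. Deciding which of the two is right for a given name is outside what a pattern like this can know.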
Dec 18, 2014 at 12:29 comment added MERose Why not? d'Artagnan is the correct version, so the letters following ' (and also those following a -) should be capitals. We don't care if the letter in front of that sign is upper- or lowercase.
Dec 18, 2014 at 10:16 history edited mikeserv CC BY-SA 3.0
deleted 4 characters in body
Dec 18, 2014 at 10:15 comment added mikeserv @MERose - wait what? You want letters following ' apostrophes not to be converted? That's simply done - but the whole D'ARTAGNAN thing is another story altogether... Ok, it won't convert the letters following a ' now.
Dec 18, 2014 at 9:26 comment added MERose In fact, it's good when 'T stays 'T because, as I wrote, it's names that I am going to fix and most words should start with capital letters. Look, for example, at D'ARTAGNAN, D'HONDT or DELL'ALBA.
Dec 18, 2014 at 2:15 history edited mikeserv CC BY-SA 3.0
deleted 2 characters in body
Dec 18, 2014 at 2:07 comment added mikeserv @don_crissti - I probably could handle that stuff - but not without an explicit definition list. Not in a simple script, anyway. I'm halfway imagining something like the global s///ubstitution I used here like sed ... "s/\($(printf "\(...%s...*\)*" "$@")\).*/\1/" but I'd need at least two passes at input and a whitelist array. I really appreciate your comments on this stuff, by the way - always spot-on, too. Weird that other people think about it too, actually.
Dec 18, 2014 at 1:41 comment added don_crissti Yes, I think it's better now. As I said, it's almost impossible to deal with names like D'Artagnan or MacBride.
Dec 18, 2014 at 1:29 history edited mikeserv CC BY-SA 3.0
edited body
Dec 18, 2014 at 1:23 comment added mikeserv @don_crissti - Oh - that's strange. I just changed it around. I probably won't handle VanZant... but, Don'T shouldn't be a problem, I think. Well, the apostrophe thing works now. But... VanZant is right out.
Dec 18, 2014 at 1:22 history edited mikeserv CC BY-SA 3.0
added 1123 characters in body
Dec 18, 2014 at 0:54 comment added don_crissti The problems that all (possible) answers will have to deal with (an impossible quest IMO) are mainly apostrophes and proper nouns. DON'T KNOW WHAT'S UP is replaced with Don'T Know What'S Up although it should be Don't Know What's Up. No big deal to fix, I know: you can lowercase any letter following an apostrophe, but then O'MALLEY becomes O'malley. Another thing: MCCABE & MRS. MILLER should be replaced with McCabe & ... not Mccabe & ..., PAIGE VANZANT should be replaced with Paige VanZant, etc. Depending on the input file, this type of substitution can turn into a real nightmare.
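The trade-off in this comment is easy to reproduce. The sketch below uses Python instead of the answer's sed script, purely as an illustration: `naive_title` relies on the built-in `str.title()`, which capitalizes after any non-letter, and `capitalize_words` is a hypothetical helper that lowercases everything after the first character of each space-separated word.

```python
def naive_title(text):
    """Capitalize after every non-letter, apostrophes included."""
    return text.title()


def capitalize_words(text):
    """Capitalize only the first character of each space-separated word."""
    return " ".join(word.capitalize() for word in text.split(" "))


if __name__ == "__main__":
    samples = [
        "DON'T KNOW WHAT'S UP",
        "O'MALLEY",
        "MCCABE & MRS. MILLER",
        "PAIGE VANZANT",
    ]
    for sample in samples:
        print(f"{sample!r}: {naive_title(sample)!r} / {capitalize_words(sample)!r}")
```

Neither rule gets every sample right: the first yields `Don'T` but `O'Malley`, the second yields `Don't` but `O'malley`, and both produce `Mccabe` and `Vanzant`.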
Dec 18, 2014 at 0:09 history answered mikeserv CC BY-SA 3.0