I am parsing a table in a saved .html document. The HTML looks like this:
<table id="detailBody" width="100%" cellspacing="0" cellpadding="0" border="0" class="tab2" style="display: block;"><tbody> <tr><td><ul><li><span>15:00:19</span><span class="red">11.750</span><span class="red">5392</span><span class="fr red">↑</span></li><li><span>14:56:55</span><span class="red">11.750</span><span class="red">17</span><span class="fr red">↑</span></li><li><span>14:56:52</span><span class="red">11.750</span><span class="red">479</span><span class="fr red">↑</span></li><li><span>14:56:49</span><span class="">11.740</span><span class="green">6</span><span class="fr green">↓</span></li><li><span>14:56:46</span><span class="">11.740</span><span class="green">333</span><span class="fr green">↓</span></li><li><span>14:56:43</span><span class="">11.740</span><span class="green">21</span><span class="fr green">↓</span></li><li><span>14:56:40</span><span class="">11.740</span><span class="green">15</span><span class="fr green">↓</span></li><li><span>14:56:37</span><span class="">11.740</span><span class="green">35</span><span class="fr green">↓</span></li><li><span>14:56:34</span><span class="red">11.750</span><span class="red">11</span><span class="fr red">↑</span></li><li><span>14:56:31</span><span class="">11.740</span><span class="green">3</span><span class="fr green">↓</span></li><li><span>14:56:28</span><span class="">11.740</span><span class="green">24</span><span class="fr green">↓</span></li><li><span>14:56:22</span><span class="red">11.750</span><span class="red">291</span><span class="fr red">↑</span></li><li><span>14:56:19</span><span class="">11.740</span><span class="red">198</span><span class="fr red">↑</span></li><li><span>14:56:16</span><span class="green">11.730</span><span class="green">15</span><span class="fr green">↓</span></li></ul></td></tr> </tbody></table>

What I have so far is:
    list_a = soup.find_all('table')[0].tbody.find_all("tr")
    for a in list_a:
        for b in a:
            for c in b:
                for d in c:
                    for e in d:
                        print e.renderContents()

It doesn't look very nice, but it works. The result is like:
    15:00:19
    11.750
    5392
    ↑
    14:56:55
    11.750
    17
    ↑
    14:56:52
    11.750
    479
    ↑

However, the table contains too many rows. I only want the first 10 groups of data, and from each group only the 3rd and 4th items, collected into 2 lists,
i.e.

    ["5392", "17", "479", …]

and

    ["↑", "↑", "↑", …]

(The "↑" can be replaced with some other equivalent marker if the character itself is a problem.)

How can I achieve that? Thanks.
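
For reference, here is a minimal sketch of the result being asked for. It is only one possible approach and it rests on some assumptions: BeautifulSoup 4 under Python 3, and a shortened four-row copy of the table above. The function name `volumes_and_arrows` and the list names `volumes` and `arrows` are made up for illustration; the post does not say what the 3rd and 4th items actually mean.

```python
from bs4 import BeautifulSoup

# Shortened sample in the same shape as the table above
# (assumption: the real file is loaded into `soup` the same way).
html = """<table id="detailBody"><tbody><tr><td><ul>
<li><span>15:00:19</span><span class="red">11.750</span><span class="red">5392</span><span class="fr red">↑</span></li>
<li><span>14:56:55</span><span class="red">11.750</span><span class="red">17</span><span class="fr red">↑</span></li>
<li><span>14:56:52</span><span class="red">11.750</span><span class="red">479</span><span class="fr red">↑</span></li>
<li><span>14:56:49</span><span class="">11.740</span><span class="green">6</span><span class="fr green">↓</span></li>
</ul></td></tr></tbody></table>"""

soup = BeautifulSoup(html, "html.parser")


def volumes_and_arrows(soup, limit=10):
    """Return (3rd items, 4th items) taken from the first `limit` <li> rows."""
    table = soup.find("table", id="detailBody")
    volumes, arrows = [], []
    # find_all(..., limit=n) stops searching after the first n matching tags.
    for li in table.find_all("li", limit=limit):
        spans = li.find_all("span")
        if len(spans) < 4:
            continue  # skip rows that do not have all four fields
        volumes.append(spans[2].get_text(strip=True))  # 3rd item
        arrows.append(spans[3].get_text(strip=True))   # 4th item
    return volumes, arrows


volumes, arrows = volumes_and_arrows(soup, limit=3)
print(volumes)  # ['5392', '17', '479']
print(arrows)   # ['↑', '↑', '↑']
```

Searching for the `<li>` tags directly avoids the five nested loops, and the `limit` argument is what restricts the result to the first 10 groups when the default is used. If the arrow character causes trouble, `spans[3]["class"]` could presumably be inspected instead, since it contains `red` or `green` in the sample rows.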
