Using BeautifulSoup to find an HTML tag that contains specific text

Question

How can I use BeautifulSoup to find an HTML tag that contains specific text?

You can change the index of the required rows of df2 and then concatenate the two data frames:

df1_ind_list = [2,4,6]

df2_ind_list = [0,2,5]

df2_to_merge = df2.loc[df2_ind_list, :]
df2_to_merge.index = df1_ind_list

pd.concat([df1, df2_to_merge], axis = 1)


    Name    Net     Quantity    Cust    COT     DHL
0   Auto    1010.0  10          NaN     NaN     NaN
1   NaN     NaN     12          NaN     NaN     NaN
2   Rtal    4145.0  18          plot    2020.0  10,12
3   NaN     NaN     14          NaN     NaN     NaN
4   Indl    6223.0  16          pplr    5643.0  16,18
5   NaN     7222.0  18          NaN     NaN     NaN
6   lkr     6584.0  13          llrm    9855.0  16,65
7   trml    9854.0  45          NaN     NaN     NaN

61

python regex beautifulsoup html-content-extraction

asked by Charles Stewart on 28 December 2009 at 16:13

1 answer

score 71 · Accepted Answer

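from bs4 import BeautifulSoup  # BeautifulSoup 4 (package: beautifulsoup4)
import re

html_text = """
<h2>this is cool #12345678901</h2>
<h2>this is nothing</h2>
<h1>foo #126666678901</h1>
<h2>this is interesting #126666678901</h2>
<h2>this is blah #124445678901</h2>
"""

# Name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(html_text, "html.parser")

# soup(...) is shorthand for soup.find_all(...); string= matches text nodes
for elem in soup(string=re.compile(r' #\S{11}')):
    print(elem.parent)  # the tag that encloses the matching text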

Output:

<h2>this is cool #12345678901</h2>
<h1>foo #126666678901</h1>
<h2>this is interesting #126666678901</h2>
<h2>this is blah #124445678901</h2>
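The search above matches text nodes in any tag, so the `<h1>` element is a candidate as well. If only `<h2>` elements are wanted, one possible variation is to pass the tag name together with `string=`. This is a minimal sketch that assumes BeautifulSoup 4 (`bs4`) and the built-in `html.parser`; the names `pattern` and `h2_matches` are illustrative and not part of the original answer.

```python
import re
from bs4 import BeautifulSoup

html_text = """
<h2>this is cool #12345678901</h2>
<h2>this is nothing</h2>
<h1>foo #126666678901</h1>
<h2>this is interesting #126666678901</h2>
<h2>this is blah #124445678901</h2>
"""

# Same pattern as in the answer: " #" followed by 11 non-whitespace characters
pattern = re.compile(r' #\S{11}')

soup = BeautifulSoup(html_text, "html.parser")

# With a tag name and string= together, find_all returns the <h2> tags
# whose .string matches the pattern (not the text nodes themselves).
h2_matches = soup.find_all("h2", string=pattern)

for tag in h2_matches:
    print(tag)
```

Note that this form relies on each tag having a single text child, as in the sample markup; for tags with nested children, `.string` is `None` and the tag would be skipped.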