This repository has been archived by the owner on Mar 23, 2023. It is now read-only.

Migrate classifier from PHP to Python #30

Open
JaTvoiRabotnik opened this issue Feb 13, 2017 · 0 comments

The "human classifier" algorithm lives in clean2.php. It looks for the signatures "TABLE topo_materia" and "DIV pagina" and keeps only the core of the DIV:

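// Signature 1: the header table must exist and hold more than 100 bytes of trimmed text.
$tabtopo = $xpath->query('//table[@class="topo_materia"]');
if ($tabtopo->length > 0 && strlen(trim($tabtopo->item(0)->nodeValue)) > 100) {
    // Signature 2: the page DIV inside a table cell must pass the same length check.
    $div = $xpath->query('//td//div[@id="pagina"]');
    if ($div->length > 0 && strlen(trim($div->item(0)->nodeValue)) > 100) {
        print ".. div id-pagina..";
        // Keep only the DIV itself, serialized as XML.
        $htm = $dom->saveXML($div->item(0));
    }
}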

We can already migrate this code to Python.
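
As a starting point, here is a minimal sketch of how the two signature checks might look in Python. It is not the project's implementation: the names `extract_pagina`, `text_length`, and `MIN_TEXT_LENGTH` are hypothetical, and it assumes the input is well-formed XML so that the standard library's `xml.etree.ElementTree` can parse it. Real-world HTML would likely need a more lenient parser. Note also that `len()` counts characters while PHP's `strlen()` counts bytes, so the threshold may behave slightly differently on non-ASCII text.

```python
import copy
import xml.etree.ElementTree as ET
from typing import Optional

MIN_TEXT_LENGTH = 100  # same threshold as the PHP snippet (hypothetical name)


def text_length(elem: ET.Element) -> int:
    """Length of the element's trimmed text content.

    Rough equivalent of PHP's strlen(trim($node->nodeValue)), but counting
    characters rather than bytes.
    """
    return len("".join(elem.itertext()).strip())


def extract_pagina(markup: str) -> Optional[str]:
    """Return the serialized <div id="pagina"> if both signatures match, else None.

    Assumes `markup` is well-formed XML (an assumption of this sketch).
    """
    root = ET.fromstring(markup)

    # Signature 1: TABLE topo_materia with enough text.
    topo = root.find(".//table[@class='topo_materia']")
    if topo is None or text_length(topo) <= MIN_TEXT_LENGTH:
        return None

    # Signature 2: DIV pagina nested somewhere inside a table cell.
    div = root.find(".//td//div[@id='pagina']")
    if div is None or text_length(div) <= MIN_TEXT_LENGTH:
        return None

    # Serialize only the DIV; drop any text that follows its closing tag.
    core = copy.copy(div)
    core.tail = None
    return ET.tostring(core, encoding="unicode")
```

Calling `extract_pagina(page_source)` returns the DIV markup as a string when both signatures are found and `None` otherwise, which replaces the PHP version's side effect of setting `$htm` only on success.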
