ModuleNotFoundError: No module named 'frontera.contrib.scrapy.middlewares.seeds' #371
Comments
Please use Stack Overflow for this type of question. See also https://stackoverflow.com/help/mcve and https://stackoverflow.com/help/how-to-ask
@Gallaecio I checked all the previous questions and saw that @sibiryakov is very responsive in solving problems, which is why I asked here. I will try asking on Stack Overflow... I have now posted the question there: https://stackoverflow.com/questions/56493245/modulenotfounderror-no-module-named-frontera-contrib-scrapy-middlewares-seeds Sorry, I do not have enough reputation to post images on Stack Overflow, so I used imgur.com instead.
@sibiryakov I found a solution for this error
You should add this line
before
Seems like partitions_for_topic does not request a metadata refresh, whereas topics does. No clue why this worked in kafka-python 1.4.4, as it seems the two functions have not changed. Maybe metadata was always refreshed as soon as possible when creating the consumer in 1.4.4? Making partitions_for_topic call the same code as topics before returning the partitions seems to solve the problem. Have a look; they have been working on a fix for this recently.
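To make the workaround described above concrete, here is a minimal sketch, assuming a kafka-python `KafkaConsumer` (or any object exposing the same `topics()` and `partitions_for_topic()` methods). The helper name `partitions_after_refresh` is hypothetical and is not part of Frontera or kafka-python; it only illustrates the ordering of the two calls.

```python
def partitions_after_refresh(consumer, topic):
    """Return the partition ids known for `topic`, refreshing metadata first.

    `consumer` is assumed to behave like kafka-python's KafkaConsumer.
    This helper is a hypothetical illustration, not library code.
    """
    # topics() asks the cluster for up-to-date metadata as a side effect.
    consumer.topics()
    # partitions_for_topic() may return None when the topic is unknown,
    # so fall back to an empty set to keep the return type stable.
    return consumer.partitions_for_topic(topic) or set()


# Possible usage against a local broker (assumed address and topic name):
#   from kafka import KafkaConsumer
#   consumer = KafkaConsumer(bootstrap_servers='localhost:9092')
#   print(partitions_after_refresh(consumer, 'frontier-done'))
```

Whether this is the line the commenter actually added is not stated in the thread; it is only one way to force the refresh before the partitions are read.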
@sibiryakov
When I inject the seeds file with the command below,
I get this error in the DB worker terminal at the same time
But I still get 0 pages crawled... Please help me when you are free, thanks in advance!
Hi @liho00, your seeds weren't injected because the strategy worker was unable to create the table.
@sibiryakov Hi, I am sure I created the namespace crawler beforehand, and I am also sure the queue table was created. I need to clarify that I am using Frontera v0.8.1, as 'frontera.contrib.scrapy.middlewares.seeds' has been removed in this version. When I tried again, the error still showed up after entering this command
But after a few seconds it shows the seeds as injected? I am still getting 0 pages crawled. Besides that, can you tell me how to inject the seeds? If this module is not needed,
should I inject the seeds into my strategy worker? Lastly, I cannot force-close my crawler; it is trapped in an endless loop.
Solved by downgrading kafka-python to v1.4.4.
If that’s the only fix, then we need to either update
@Gallaecio it should be a tiny PR #371 (comment)
Besides that, I cannot force-close the spiders; they are trapped in an endless loop.
I have the same problem. How can we solve this?
Getting the same issue here.
@sibiryakov Hi, thanks for your suggestion about Kafka. I already have it installed on my PC. I intend to build a Kafka + HBase crawler.
I have a few questions. First, when I run these commands
python -m frontera.utils.add_seeds --config tutorial.config.dbw --seeds-file seeds.txt
scrapy crawl tutorial -L INFO -s SPIDER_PARTITION_ID=0
I get this error
ModuleNotFoundError: No module named 'frontera.contrib.scrapy.middlewares.seeds'
After I removed the entry struck out below, Scrapy runs, but 0 pages are crawled
SPIDER_MIDDLEWARES = { 'frontera.contrib.scrapy.middlewares.schedulers.SchedulerSpiderMiddleware': 999, ̶ ̶ ̶ ̶'̶f̶r̶o̶n̶t̶e̶r̶a̶.̶c̶o̶n̶t̶r̶i̶b̶.̶s̶c̶r̶a̶p̶y̶.̶m̶i̶d̶d̶l̶e̶w̶a̶r̶e̶s̶.̶s̶e̶e̶d̶s̶.̶f̶i̶l̶e̶.̶F̶i̶l̶e̶S̶e̶e̶d̶L̶o̶a̶d̶e̶r̶'̶:̶ ̶1̶,̶ }
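For readability, the setting after that removal would look roughly like the fragment below. This is only a sketch of the Scrapy settings entry quoted above with the struck-out FileSeedLoader line dropped; in this thread the seeds are injected separately with the `frontera.utils.add_seeds` command shown earlier.

```python
# Scrapy settings fragment: only the Frontera scheduler spider middleware
# remains; the FileSeedLoader entry is gone because the module
# frontera.contrib.scrapy.middlewares.seeds cannot be imported in v0.8.1.
SPIDER_MIDDLEWARES = {
    'frontera.contrib.scrapy.middlewares.schedulers.SchedulerSpiderMiddleware': 999,
}
```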
Besides that, Kafka did not consume any messages.
All my config follows the cluster setup guide in the documentation.
Regarding the Kafka problems:
After I add this line
MESSAGE_BUS = 'frontera.contrib.messagebus.kafkabus.MessageBus'
and remove̶ ̶ ̶ ̶'̶f̶r̶o̶n̶t̶e̶r̶a̶.̶c̶o̶n̶t̶r̶i̶b̶.̶s̶c̶r̶a̶p̶y̶.̶m̶i̶d̶d̶l̶e̶w̶a̶r̶e̶s̶.̶s̶e̶e̶d̶s̶.̶f̶i̶l̶e̶.̶F̶i̶l̶e̶S̶e̶e̶d̶L̶o̶a̶d̶e̶r̶'̶:̶ ̶1̶,̶
I get this problem when I start the DB worker, strategy worker, and crawler.
my config
common.py
dbw.py
spider.py
sw.py
worker.py
How I create the Kafka topics:
kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 2 --topic frontier-done
kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 2 --topic frontier-todo
kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 2 --topic frontier-score
I set the number of partitions to 2 in common.py.
How I start the Kafka consumers:
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic frontier-done --from-beginning
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic frontier-todo --from-beginning
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic frontier-score --from-beginning
Versions of tools:
Name: frontera
Version: 0.8.1
Name: Scrapy
Version: 1.6.0
Name: Python
Version: 3.7.3
Name: Kafka
Version: 2.2.1
I think the docs may not have been updated for v0.8.1; they still seem to be at v0.8.0.1.
Should I downgrade Frontera to the stable version v0.8?
I would prefer to use the latest version, though.
Thanks in advance!