#Parsing stdout for regex pattern

gray mist

Hi, new Dagger user here!

I'm using the Dagger Python SDK and trying to parse the stdout of an exec command to determine whether the code ran into any issues.

  1. What's the best way to proceed? So far I wrote a separate function that does the string processing.
  2. How do I print or log the result of this test to the terminal? Right now, print/logging output isn't shown in the terminal.

Thanks
JP

gaunt wadi

can you share your current code snippet?

gray mist
# gaunt wadi can you share your current code snippet?

Thanks a lot for your light-speed guidance!
I'm using Scrapy, a scraping framework in Python. Internally, it uses the standard logging library that streams logs to stdout.
Unfortunately, an exit-code approach like the one you suggested won't work here, because Scrapy returns 0 once the queuing scheduler has finished, regardless of whether the crawl actually succeeded.
However, a stats report is generated in the logs at the end of execution, which I am trying to parse to determine whether the test has passed.

2025-01-17 04:24:10 [scrapy.core.engine] INFO: Closing spider (closespider_itemcount)
2025-01-17 04:24:10 [web] INFO: Number of search results: 10
2025-01-17 04:24:10 [scrapy.extensions.feedexport] INFO: Stored jsonl feed (10 items) in: test.jsonl
2025-01-17 04:24:10 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'elapsed_time_seconds': 10.298047,
 'feedexport/success_count/FileFeedStorage': 1,
 'finish_reason': 'closespider_itemcount',
 'finish_time': datetime.datetime(2025, 1, 17, 3, 24, 10, 796087, tzinfo=datetime.timezone.utc),
 'httpcache/hit': 13,
 'httpcompression/response_bytes': 64974,
 'httpcompression/response_count': 2,
 'item_scraped_count': 10,
 'items_per_minute': None,
 'log_count/DEBUG': 46,
 'log_count/INFO': 50,
 'log_count/WARNING': 1,
 'memusage/max': 162893824,
 'memusage/startup': 162893824,
 'request_depth_max': 3,
 'response_received_count': 13,
 'responses_per_minute': None,
 'scheduler/dequeued': 13,
 'scheduler/dequeued/memory': 13,
 'scheduler/enqueued': 24,
 'scheduler/enqueued/memory': 24,
 'scrapy-zyte-api/sessions/use/disabled': 13,
 'start_time': datetime.datetime(2025, 1, 17, 3, 24, 0, 498040, tzinfo=datetime.timezone.utc)}
2025-01-17 04:24:10 [scrapy.core.engine] INFO: Spider closed (closespider_itemcount)

I have several projects like this one and was hoping to monitor which crawlers require fixes.
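
For illustration, here is a minimal sketch of how the stats report above could be parsed from a captured stdout string. It uses only the standard library and assumes the report keeps the format shown in the log. The names `parse_scrapy_stats` and `crawl_passed`, and the pass criteria (a minimum item count and no `log_count/ERROR` entry), are hypothetical choices for this example, not part of Scrapy or Dagger. Because the dump contains `datetime.datetime(...)` values, `ast.literal_eval` would not accept it, so the sketch picks out only integer and quoted-string entries with a regex.

```python
import re

# Marker line that Scrapy logs right before the stats dump (taken from the log above).
STATS_MARKER = "Dumping Scrapy stats:"

# Matches entries such as 'item_scraped_count': 10, or 'finish_reason': 'finished',
# Floats, None and datetime values are deliberately skipped.
ENTRY_RE = re.compile(r"'([^']+)':\s*(?:(\d+)|'([^']*)')\s*[,}]")


def parse_scrapy_stats(stdout: str) -> dict:
    """Return the integer and string stats from the last stats dump in stdout."""
    _, found, tail = stdout.rpartition(STATS_MARKER)
    if not found:
        return {}  # no stats report in the output
    stats = {}
    for key, number, text in ENTRY_RE.findall(tail):
        stats[key] = int(number) if number else text
    return stats


def crawl_passed(stats: dict, min_items: int = 1) -> bool:
    """Hypothetical pass criteria: enough items scraped and no ERROR log lines."""
    enough_items = stats.get("item_scraped_count", 0) >= min_items
    no_errors = stats.get("log_count/ERROR", 0) == 0
    return enough_items and no_errors
```

With the log above as input, `parse_scrapy_stats` would yield entries such as `item_scraped_count` and `finish_reason`, and `crawl_passed` could be evaluated once per project to flag the crawlers that need attention.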

gaunt wadi