Token Too Long error using CloudTrail Parser | CrowdSec

bold gyro Nov 22, 2024, 9:21 PM

Hey Guys!

I'm using the aws-cloudtrail parser (https://app.crowdsec.net/hub/author/crowdsecurity/configurations/aws-cloudtrail) to retrieve data from my S3 bucket. I'm leveraging the SQS configuration.
Everything seems to be working fine, but when I check the logs I'm receiving the following error:

time="2024-11-22T21:14:09Z" level=error msg="Error while reading file: failed to read object BUCKET_NAME/PREFIX.json.gz: bufio.Scanner: token too long" method=readManager queue="https://sqs.us-east-1.amazonaws.com/ACCOUNT/SQS_QUEUE_NAME" type=s3

I tried changing max_buffer_size in the acquis.yaml configuration, but it didn't work. Here's what I used: max_buffer_size: 1000000000

Does anyone have any idea how I can resolve this issue?

dire sandalBOT Nov 22, 2024, 9:21 PM

Important Information

Thank you for getting in touch with your support request. To expedite a swift resolution, could you kindly provide the following information? Rest assured, we will respond promptly, and we greatly appreciate your patience. While you wait, please check the links below to see if this issue has been previously addressed. If you have managed to resolve it, please run the command /resolve or press the green resolve button below.

Log Files

If you possess any log files that you believe could be beneficial, please include them at this time. By default, CrowdSec logs to /var/log/, where you will discover a corresponding log file for each component.

Guide Followed (CrowdSec Official)

If you have diligently followed one of our guides and hit a roadblock, please share the guide with us. This will help us assess if any adjustments are necessary to assist you further.

Screenshots

Please forward any screenshots depicting errors you encounter. Your visuals will provide us with a clear view of the issues you are facing.

dire spear Nov 25, 2024, 9:10 AM

Can you check the length of the longest lines in your CloudTrail logs?

max_buffer_size should be bigger than that (although 1000000000 should be plenty).

Do you see any logs with Setting max buffer size to XXXX ?

bold gyro Nov 26, 2024, 1:55 PM

dire spear Can you check in your cloudtrail logs the length of the longest lines ? `max_bu...

Hey, sorry for the delay.

The longest line is 184 characters.
Yes, the logs show the message "Setting max buffer size to 10000000000", with the SQS queue and type s3.

I think the real problem might be the length of the bucket file path.

dire spear Nov 26, 2024, 1:58 PM

There isn't any limit on this AFAIK.
We do use this parser internally, reading CloudTrail logs from S3, and we don't have this issue.

for reference, this is our acquisition config:

polling_method: sqs
sqs_name: cloudtrail-queue
sqs_format: s3notification 
polling_interval: 30
aws_region: eu-west-1
transform: map(JsonExtractSlice(evt.Line.Raw, "Records"), ToJsonString(#))
max_buffer_size: 10000000
use_time_machine: true
labels:
  type: aws-cloudtrail

(the polling_interval is useless, probably a leftover from some tests)
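
For readers unfamiliar with the transform line in the config above: as far as the expression reads, it pulls the "Records" array out of the raw line and turns each element into its own JSON string, so one bundled CloudTrail line becomes one event per record. A rough Python equivalent, purely as a sketch (this is not CrowdSec's implementation, and the function name is made up for illustration):

```python
import json


def split_cloudtrail_records(raw_line: str) -> list[str]:
    """Return each entry of the top-level "Records" array as its own JSON string.

    Illustrative stand-in for:
        map(JsonExtractSlice(evt.Line.Raw, "Records"), ToJsonString(#))
    """
    document = json.loads(raw_line)
    # A CloudTrail file wraps all events in a single {"Records": [...]} object.
    records = document.get("Records", [])
    # Compact separators keep each record on one line, one event per string.
    return [json.dumps(record, separators=(",", ":")) for record in records]
```

This is also why the whole file has to fit in the read buffer first: the split into individual events only happens after the full line has been read.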

Can you try with max_buffer_size set to 10000000?
I'm wondering if 1000000000 triggers an overflow somewhere in Go, or is too big for the memory allocation to succeed.

bold gyro Nov 26, 2024, 4:44 PM

I tried that, but the same error happens. The only difference I can see between our configurations is sqs_name: my SQS queue is in another account.
Here is my config:

```yaml
source: s3
polling_method: sqs
sqs_name: "https://sqs.us-east-1.amazonaws.com/[ACCOUNT_NUMBER]/CloudTrail"
sqs_format: s3notification
polling_interval: 30
aws_region: us-east-1
max_buffer_size: 10000000
transform: map(JsonExtractSlice(evt.Line.Raw, "Records"), ToJsonString(#))
use_time_machine: true
labels:
  type: aws-cloudtrail
```

dire spear Nov 26, 2024, 4:57 PM

and if you run cscli metrics, do you see any lines read at all from S3 ?

I might have an idea

When you say "The longest line is 184 characters", where did you check?
In the S3 bucket directly?

If not, can you download a file that is mentioned in the error message and check the length of the lines in it? (CloudTrail will bundle a lot of events on the same line.)
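
One way to check this on a downloaded object is a small script like the one below. It is only a sketch: the function name is hypothetical, and it assumes the file is still gzip-compressed on disk. The constant mirrors Go's `bufio.MaxScanTokenSize` (64 KiB), which is the standard library's scanner limit when no larger buffer is configured.

```python
import gzip

# bufio.MaxScanTokenSize in Go's standard library (64 * 1024 bytes).
GO_DEFAULT_SCAN_LIMIT = 64 * 1024


def longest_line_length(path: str) -> int:
    """Return the byte length of the longest line in a gzip-compressed file."""
    longest = 0
    with gzip.open(path, "rb") as handle:
        for line in handle:
            # Do not count the line terminator itself.
            longest = max(longest, len(line.rstrip(b"\r\n")))
    return longest


def exceeds_default_limit(length: int) -> bool:
    """True when a line of this length cannot fit in the default scanner buffer."""
    return length >= GO_DEFAULT_SCAN_LIMIT
```

Calling `longest_line_length("object.json.gz")` on the file named in the error message gives the number that max_buffer_size needs to comfortably exceed.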

Also, if you are willing to share one file that triggers the error so I can reproduce it locally, that would be amazing. (I know it will more than likely contain a ton of private data such as account IDs, AWS usernames and so on, but CloudTrail will, in theory, remove anything really sensitive like credentials.)

bold gyro Nov 26, 2024, 6:34 PM

dire spear and if you run `cscli metrics`, do you see any lines read at all from S3 ?

Hey blotus, sorry, I misunderstood your question. Actually, the JSON file has just 1 line with 91,500 characters.

dire spear Nov 26, 2024, 6:51 PM

For reference, I just had another look at our logs, and it turns out we sometimes get this error too (I just found a file with a single 12-million-character line in it),
but there is no issue once I increase max_buffer_size.

Can you try to run this command on the machine where your crowdsec is running (replace the region with the one you are in):

AWS_REGION=eu-west-1 crowdsec -dsn "s3://bucket/path/to/the/object.json.gz?max_buffer_size=100000" --type cloudtrail -no-api

and adjust the max_buffer_size value until it succeeds.

bold gyro Nov 26, 2024, 10:29 PM

dire spear For reference, I just had another look at our logs, and turns out we sometimes g...

Interesting and really strange! Passing max_buffer_size in the command works.
🤔
But if I remove the ?max_buffer_size part, the error shows up, even though max_buffer_size is set in the acquis configuration.

dire spear Nov 26, 2024, 10:59 PM

if you remove it, it will default to 65k, so it's expected to fail in your case.

If I understand correctly, you are saying that it does work when you replay the file, but using the same value for max_buffer_size in the acquis.yaml does not work?

bold gyro Nov 27, 2024, 1:03 PM

Exactly.

dire spear Nov 28, 2024, 11:07 AM

It doesn't make any sense :/
It's the same code that is used to read the file in both cases.

The only difference I can think of is that, in some cases, the AWS SDK seems to automatically decompress the gz file for us, and sometimes not (and it's not clear when it happens).
Depending on whether it was decompressed or not, we read the file slightly differently, maybe the issue is coming from that, I'll try to test some things
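
To illustrate the kind of branching described here, this is a minimal sketch of reading a payload that may or may not still be gzip-compressed. It is not CrowdSec's code; it assumes the decision can be made from the two-byte gzip magic number, and the function name is hypothetical.

```python
import gzip

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def read_maybe_gzipped(payload: bytes) -> bytes:
    """Return decompressed bytes if payload is gzip data, else payload unchanged."""
    if payload[:2] == GZIP_MAGIC:
        # The object arrived still compressed: decompress it ourselves.
        return gzip.decompress(payload)
    # The object was already decompressed upstream: use it as-is.
    return payload
```

If the two branches end up using different readers, a buffer setting applied to only one of them would produce exactly this sort of "works in one mode, fails in the other" behaviour, though that is a guess and not something confirmed in this thread.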

bold gyro Nov 28, 2024, 12:30 PM

I was thinking about that. Does the DSN command read directly from S3? The acquis is configured to use an SQS queue, so maybe it gets different results because of that.
It's really strange: it's as if the acquis can't apply max_buffer_size, yet at the same time it generates logs like "Setting max buffer size".

bold gyro Dec 5, 2024, 7:14 PM

dire spear It doesn't make any sense :/ It's the same code that is used to read the file in...

Is it possible to run a DSN command using the SQS address? Just to try.

dire spear Dec 6, 2024, 12:04 PM

bold gyro It's possible to run a dsn command using the sqs address? just to try

no, the DSN command will fetch the file directly (SQS is just there to tell crowdsec "look, there's a new file here")
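
To make the SQS role concrete: with S3 event notifications delivered straight to the queue, each message body is a JSON document naming the bucket and object key, and the reader then fetches that object from S3. A minimal sketch of extracting those pointers, assuming the standard S3 event notification layout (not CrowdSec's actual code; the function name is hypothetical):

```python
import json
from urllib.parse import unquote_plus


def objects_from_s3_notification(message_body: str) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs referenced by an S3 event notification body."""
    event = json.loads(message_body)
    found = []
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        key = s3.get("object", {}).get("key")
        if bucket and key:
            # Object keys are URL-encoded in S3 notifications.
            found.append((bucket, unquote_plus(key)))
    return found
```

The queue only carries these pointers, never the log content itself, which is why a replay goes directly to the `s3://` address.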

solemn copper Apr 9, 2025, 2:42 PM

dire spear It doesn't make any sense :/ It's the same code that is used to read the file in...

Hi, bringing up this topic again: has there been any update here?