#I have the following code to parse PPM file.

72 messages · Page 1 of 1 (latest)

eager wing
#

I have the following code to parse PPM file.
(Specification can be found here https://netpbm.sourceforge.net/doc/ppm.html)
My question is how can I can get preferably get rid of the first for loop without increasing complexity at lot?
And without using alloc or the std lib.

#![no_std]

use core::slice;

#[derive(Debug)]
pub struct PPM<'a>{
    pub magic: [u8;2],
    pub width: u16,
    pub heigth: u16,
    pub max_col: u16,
    pub data: &'a[u8]
}

impl<'a> PPM<'a> {
    pub fn new(img: *const u8,size: usize) -> Option<Self>  {
        let img_sl = unsafe { slice::from_raw_parts(img, size) };

        let mut data_start = 0;
        let mut fields_seen = 0;
        for (i,x) in img_sl.iter().enumerate() {
            if x.is_ascii_whitespace() {
                fields_seen += 1;
                if fields_seen == 4 {
                    data_start = i;
                    break;
                }
            } else if i > 20 { return None }
        }

        let hdr = unsafe { core::str::from_utf8_unchecked(&img_sl[0..data_start]) };
        let mut hdr_it = hdr.split_whitespace();
        let magic = hdr_it.next().unwrap();
        if magic != "P6" { return None }
        let ppm = PPM {
            magic: *b"P6",
            width: u16::from_str_radix(hdr_it.next().unwrap(), 10).ok()?,
            heigth: u16::from_str_radix(hdr_it.next().unwrap(), 10).ok()?,
            max_col: u16::from_str_radix(hdr_it.next().unwrap(), 10).ok()?,
            data: &img_sl[hdr_it.count()..]
        };
        Some(ppm)
    }
}```
gleaming pecan
#

are you parsing some kind of file format?

eager wing
#

yes PPM

#

"I have the following code to parse PPM file."

gleaming pecan
#

Oh, i scrolled all the way 😄

eager wing
#

Follow the link for the spec for in case you do not trust me, feel free to google the file format instead

#

I encourage not clicking random links honestly :)

gleaming pecan
#

yeah you can simplify this massively

eager wing
gleaming pecan
#

one sec

eager wing
#

My main goal is to get rid of this first loop, looping over the 4 first fields.

#

So that I know where the start of the actual data is

gleaming pecan
#

it can be a single match, like ```rust
impl<'a> PPM<'a> {
pub fn new(img: *const u8, size: usize) -> Option<Self> {
let img_sl = unsafe { slice::from_raw_parts(img, size) };

    if let [b'P', b'6', b'\t' | b'\n' | b'\x0C' | b'\r' | b' ', width1, width2, b'\t' | b'\n' | b'\x0C' | b'\r' | b' ', heigth1, heigth2, b'\t' | b'\n' | b'\x0C' | b'\r' | b' ', max_col1, max_col2, b'\t' | b'\n' | b'\x0C' | b'\r' | b' ', data @ ..] =
        img_sl
    {
        Some(PPM {
            magic: *b"P6",
            width: u16::from_be_bytes([*width1, *width2]),
            heigth: u16::from_be_bytes([*heigth1, *heigth2]),
            max_col: u16::from_be_bytes([*max_col1, *max_col2]),
            data,
        })
    } else {
        None
    }
}

}

#

If i'm reading the spec right

#

it also didnt say anything about endianness so i guessed big endian

eager wing
#

it depends on the last ascii value which is maximum value of color

#

if it's less than 255 then colors are 8bit and in the order of R G B

#

if above 255 they are 16bit R G B

#

( do not handle that and only care about 8bit colors, so feel free to ignore that part of the spec)

gleaming pecan
#

right

#

anyway let me come up with something a bit more scaleable

eager wing
#

it would be a pleasure to see to code of someone more experienced than me

gleaming pecan
#

i would probably split this up, so you can also return useful error messages: ```rust
impl<'a> PPM<'a> {
pub fn new(img: *const u8, size: usize) -> Result<Self, ???> {
let mut img_sl = unsafe { slice::from_raw_parts(img, size) };

    let magic = if let [b'P', b'6', rest @ ..] = img_sgl {
       img_sl = rest;
        *b"P6"
    } else {
        return Err("invalid header")
    }
    
    if let [b'\t' | b'\n' | b'\x0C' | b'\r' | b' ', rest @ ..] = img_sgl {
       img_sl = rest;
    } else {
        return Err("expected whitespace")
    }
    
    let width = if let [width1, width2, rest @ ..] = img_sgl {
       img_sl = rest;
       u16::from_be_bytes([*heigth1, *heigth2])
    } else {
        return Err("invalid length")
    }
    
    // etc
    
    
    Ok(PPM {
            magic,
            width,
            heigth: ,
            max_col: u16::from_be_bytes([*max_col1, *max_col2]),
            data,
        })
}

}```

#

Note how I'm rebinding img_sl so we're advancing along it

eager wing
#

Questions arise

#

what is: @.. in the slice?

gleaming pecan
#

it's basically a binding operator

eager wing
#

specifically why the at

eager wing
gleaming pecan
#

it's very suitable and popular for exactly this kind of thing

#

Also i have a comment about your code, I would strongly recommend you take &[u8] as input to the function, not a raw pointer. let the caller deal with making the slice

eager wing
eager wing
gleaming pecan
#

yes

#

it's also way more convenient if the caller already has a vec or something like that

eager wing
#

yeeeees just one thing

#

this is intended for #![no_std]

#

so assume vec is out of question for now

gleaming pecan
#

your crate may be no_std but someone who is using your crate might not be

#

it applies for slices and arrays as well

eager wing
#

right

#

Thinking about it yeah you're right

#

one thing I'm wondering though is

gleaming pecan
# eager wing I understand but I would like to implement it myself without crates For the sake...

I would recommend you start with something like this: ```rust
enum ReadError {
InvalidLength,
Foo,
Bar,
}

struct Cursor<'buf> {
buf: &'buf [u8],
index: usize
}

impl<'buf> Cursor<'buf> {
fn read_u16(&mut self) -> Result<u16, ReadError> {
if let [a, b, rest @ ..] = self.buf {
self.index += 2;
Ok(u16::from_be_bytes([*a, *b]))
} else {
Err(ReadError::InvalidLength)
}
}

fn read_whitespace(&mut self) -> Result<u16, ReadError> {
    todo!()
}

}
}```

#

i have also written something like this and you really do need this to not make a big mess

eager wing
# eager wing one thing I'm wondering though is

If the header is literally ascii.
Is there some way I can parse it as that?
Because as I see you are parsing it as bytes and maybe I'm misreading the code but how do you know when the data starts when metadata is up to 16bit in ascii so up to 65536, 5 chars.

gleaming pecan
#

ah, is it just "keep reading until you find a whitespace"?

#

I think i misunderstood then

eager wing
#

it's basically variable size ascii encoded fields until whitespace

#

I will apologize, I will take a look at it tomorrow and try to make things better.
I'm just a little drunk rn

gleaming pecan
#

then you do need a loop

#

for example ```rust
impl<'buf> Cursor<'buf> {
fn read_u8(&mut self) -> Result<u8, ReadError> {
if let [a, rest @ ..] = self.buf {
self.index += 1;
Ok(*a)
} else {
Err(ReadError::InvalidLength)
}
}

fn read_until_whitespace(&mut self) -> Result<usize, ReadError> {
    let mut ret = self.read_u8()? as usize;
    loop {
        let next = self.read_u8()?;
        if next.is_ascii_whitespace() {
            return Ok(ret);
        }
        ret = (ret << 8) | next as usize;
    }
}

}```

gleaming pecan
#

Something like parsing a file format is an excellent (first) rust project by the way 🙂

eager wing
gleaming pecan
eager wing
#

This is how for I am hhh

#

the fox is a PPM image

#

with the code I posted

#

But note that doing this kind of project, does NOT mean that I know rust

gleaming pecan
#

Why is it not a picture of ferris? ferrisWhat

eager wing
#

lemme download one

#

this one is not transpatent xD

#

but it works

#
  • transparency is usually converted as black pixels when converting png to ppm
gleaming pecan
#

beautiful