#insert data from webscraper into databank

23 messages · Page 1 of 1 (latest)

keen brook
#

I'm making a web scraper that scrapes information from websites that offer translated series. The idea is to centralize their series and offer them on one site with links to their chapters. So if I bookmark one of their series and want to read it, I select a chapter and go to their site.

The scraper works and gets all the necessary information, but I'm struggling to figure out the best way to save it to the database (MySQL).
Right now the scraper consists of an abstract class that implements an interface.

It works when I make a request, but the code is a war crime and I'm sure I'm handling it the wrong way.

fathom rune
#

Hey, by databank do you mean database? Just checking before I start barking up the wrong tree

keen brook
#

@fathom rune
yeah sorry, database

fathom rune
#

Mind sending over a code snippet? It kinda depends on what sort of data you're scraping, but maybe make use of an Eloquent model?

keen brook
#

tbh, I'm trying multiple things and it's all over the place, but I'll explain the idea

I'm making a web scraper that scrapes information from websites that offer translated series. The idea is to centralize their series and offer them on one site with links to their chapters. So if I bookmark one of their series and want to read it, I select a chapter and go to their site.

The scraper works and gets all the necessary information, but I'm struggling to figure out the best way to save it to the database.
Right now the scraper consists of an abstract class that implements an interface.
The abstract class

```php
abstract class Scraper implements IScraper
{
    protected $url = null;
    protected $src = "Scraper";
    protected $requestCounter;
    protected $counter = 0;

    public function __construct()
    {
        $this->requestCounter = 0;
    }

    protected static function createChapters($chapterCrawler)
    {
    }

    protected static function addExtraInfo($chapterCrawler)
    {
    }

    public function requestCooldown()
    {
        //echo $this->requestCounter;
        if ($this->requestCounter >= 10) {
            //echo 'going to sleep';
            $this->counter++;
            //echo $this->counter;
            sleep(30);
            $this->requestCounter = 0;
        } else {
            $this->requestCounter++;
        }
    }
}
```

The interface

```php
interface IScraper
{
    public function run();
    public function serieUpdater();
    public function chapterUpdater();
    public function updateDomain($newDomainName);
    public function requestCooldown();
}
```
#

One of the scrapers. It's really messy because I've been trying a lot of different approaches

#

SerieService because I learned the controller shouldn't have too much logic

```php
class ScraperService
{
    protected $scanlators;
    protected $scrapers = [];

    public function __construct($scanlators)
    {
        $this->scanlators = $scanlators;

        // selectScraper() is an instance method, so call it on $this.
        $this->selectScraper();
    }

    private function selectScraper()
    {
        $this->scanlators->each(function ($scanlator) {
            switch ($scanlator->name) {
                case 'AsuraScans':
                    $this->scrapers[] = new AsuraScansScraper();
                    break;
                case 'FlamesComics':
                    //$this->scrapers[] = new FlameComicsScraper();
                    break;
                default:
                    // each() ignores this value, so unknown scanlators are skipped silently.
                    return "Scraper does not exist";
            }
        });
    }

    public function scrapeSerie()
    {
        foreach ($this->scrapers as $scraper) {
            $scraper->run();
        }
    }
}
```
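
One way to make the scraper selection above easier to extend is a name-to-class map instead of a `switch`. The following is only a minimal sketch in plain PHP: `ScraperContract`, the two `*Stub` classes, and `buildScrapers()` are hypothetical stand-ins for illustration, not the poster's real `IScraper` implementations.

```php
<?php

// Hypothetical stand-in for the thread's IScraper interface.
interface ScraperContract
{
    public function run(): string;
}

// Hypothetical stubs; the real scrapers would fetch and parse pages.
final class AsuraScansScraperStub implements ScraperContract
{
    public function run(): string
    {
        return 'asura';
    }
}

final class FlameComicsScraperStub implements ScraperContract
{
    public function run(): string
    {
        return 'flame';
    }
}

/**
 * Builds one scraper per known scanlator name and collects unknown names
 * instead of dropping them silently.
 *
 * @param array<string, string> $registry       scanlator name => scraper class name
 * @param array<int, string>    $scanlatorNames names loaded from the database
 * @return array{scrapers: array<int, ScraperContract>, unknown: array<int, string>}
 */
function buildScrapers(array $registry, array $scanlatorNames): array
{
    $scrapers = [];
    $unknown = [];

    foreach ($scanlatorNames as $name) {
        if (isset($registry[$name])) {
            $class = $registry[$name];
            $scrapers[] = new $class();
        } else {
            $unknown[] = $name;
        }
    }

    return ['scrapers' => $scrapers, 'unknown' => $unknown];
}

// Adding a new site means adding one line here, not another case branch.
$registry = [
    'AsuraScans' => AsuraScansScraperStub::class,
    'FlamesComics' => FlameComicsScraperStub::class,
];

$result = buildScrapers($registry, ['AsuraScans', 'SomeOtherGroup']);
```

Here `$result['unknown']` holds the names that had no matching scraper, so the caller can log or report them rather than relying on a return value that nothing reads.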
fathom rune
#

Apologies, I'm writing this just before powering down for the night.

I wouldn't worry too much about a service layer, if you're just learning. Just go with whatever feels most intuitive for you (equally, that could be a service layer).

Is your question about how to implement "saving your scraped data to your database" (in which case - is $model->save not working for you?)

Or is it more about the structure of your database?

keen brook
#

Right now it works, the table gets its values, but I can't help but feel I'm using Laravel the wrong way (not about the service).
I don't know how to properly explain it.
In my last project I used mass assignment, FormRequests that do the validation, and the request object.
This is an example. It's in the controller, but look how easily the data is saved and stored.
What I'm doing is so messy, and I can't help but wonder if there isn't a way to make it more efficient and clean.

```php
public function store(QuestionFormRequest $request)
{
    $validated = $request->validated();
    $category = Category::findOrFail($request->input('category_id'));

    $question = $request->user()->questions()->create($validated)->category()->associate($category);

    $question->save();

    return redirect()
        ->route('questions.show', [$question])
        ->with('success', 'Question is submitted! Title: ' . $question->title);
}
```
#

my apologies if I can't explain it properly

fathom rune
#

Do you mind showing the request? Although I really am going to bed after this message lol. I'll reply again tomorrow!

And maybe the Category and Question models too?

#

to be clear, by request I mean the QuestionFormRequest class

keen brook
#

gn,
QuestionFormRequest

```php
class QuestionFormRequest extends FormRequest
{
    /**
     * Determine if the user is authorized to make this request.
     */
    public function authorize(): bool
    {
        return true;
    }

    /**
     * Get the validation rules that apply to the request.
     *
     * @return array<string, \Illuminate\Contracts\Validation\ValidationRule|array<mixed>|string>
     */
    public function rules(): array
    {
        return [
            'title' => ['required', 'string', 'max:50'],
            'question' => ['required', 'string', 'max:500'],
            'anwser' => ['required', 'string', 'max:500'],
            'category_id' => ['required'],
        ];
    }
}
```

Category and Question models
```php
class Category extends Model
{
    use HasFactory;

    protected $fillable=['name',];

    public function user():BelongsTo
    {
        return $this->belongsTo(User::class);
    }
    public function questions():HasMany
    {
        return $this->hasMany(Question::class);
    }
}
class Question extends Model
{
    use HasFactory;
    protected $fillable=['title','question','anwser','category_id'];

    public function user():BelongsTo
    {
        return $this->belongsTo(User::class);
    }

    public function category():BelongsTo
    {
        return $this->belongsTo(Category::class);
    }
}
```
muted root
#

Draw out a database diagram, and it'll almost certainly start to make sense

#

Otherwise you'll go round in circles!

#

Make sure to wrap any findOrFail with some try/catch logic

#

You've made "category_id" in Question fillable, so you could just fill in category_id in your question create step, and then you don't need to do the associate() step, as it's already done, but that's not best practice if you could have security concerns around categories.

keen brook
#

The questions and categories are an example from my previous project, where you can cleanly make a question and give it a category.

Right now I'm making a web scraper that does something similar, but in a very ugly manner, so I'm asking if it's possible to make it work like the questions and categories.
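
A possible way to get closer to the question/category flow is to have each scraper return plain, validated data and to let a separate step write it. Below is a minimal sketch in plain PHP (8.0+) under that assumption. `ScrapedSeries` and `InMemorySeriesStore` are hypothetical names, and the in-memory array only stands in for the database table. In a Laravel app the `upsert()` step would likely map to Eloquent's `updateOrCreate()` on a series model, keyed by columns that identify a series uniquely; which columns those are depends on the schema, which the thread does not show.

```php
<?php

// Hypothetical value object: one scraped series, already normalised.
final class ScrapedSeries
{
    public function __construct(
        public string $source,
        public string $title,
        public string $url,
    ) {
    }

    /** Builds the object from raw scraped strings, rejecting empty values. */
    public static function fromRaw(string $source, string $title, string $url): self
    {
        $title = trim($title);
        $url = trim($url);

        if ($title === '' || $url === '') {
            throw new InvalidArgumentException('title and url are required');
        }

        return new self($source, $title, $url);
    }
}

// Hypothetical in-memory store standing in for the series table.
final class InMemorySeriesStore
{
    /** @var array<string, ScrapedSeries> rows keyed by "source|url" */
    private array $rows = [];

    /** Inserts a new row or overwrites the existing row with the same key. */
    public function upsert(ScrapedSeries $series): void
    {
        $this->rows[$series->source . '|' . $series->url] = $series;
    }

    public function count(): int
    {
        return count($this->rows);
    }

    public function titleFor(string $source, string $url): ?string
    {
        $row = $this->rows[$source . '|' . $url] ?? null;

        return $row?->title;
    }
}

$store = new InMemorySeriesStore();
$url = 'https://example.test/series/1'; // placeholder URL for illustration

$store->upsert(ScrapedSeries::fromRaw('AsuraScans', ' Example Series ', $url));
// Scraping the same series again updates the row instead of duplicating it.
$store->upsert(ScrapedSeries::fromRaw('AsuraScans', 'Example Series (renamed)', $url));
```

The point of the split is that the scraper only produces data, much like `$request->validated()` does in the `store` method, and saving becomes one short call that does not care which site the data came from.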

fathom rune
#

Hey, sorry, I'm back.

$question=$request->user()->questions()->create($validated)->category()->associate($category);

seems a little lengthy - what's the goal of the category()->associate($category) bit?

#

looks like I haven't quite got the message formatting right - how did you do it above?

keen brook
#

Three backticks (```), then you write the language right after them, in this case php, then you press enter and paste your code

#

it's to set the relation between question and category

fathom rune
#

Thanks.

Would


```php
$question = $request->user()->questions()->create($validated);
```

not suffice?