Resume eval thread | OpenAI | Page 1

sterile ingot Sep 25, 2023, 3:53 PM

#

creating a thread for this

white juniper Sep 25, 2023, 4:03 PM

#

You're not going to get good 'math' here. The model is designed to use words, not numbers.

Instead of a score with numbers, encourage it to use words, and even guide it with 'excellent match', 'average match' and 'poor match', which one is this most likely to be - for the model was also trained that it doesn't evaluate like a human, and it should tell you it may not make value judgements like you or other humans might. If you provide clear evidence that you understand those issues, it is less constrained to provide warnings and educate about that concern.

#

@misty viper

misty viper Sep 25, 2023, 4:40 PM

#

prompt
The ATS score is a measure of how well the resume matches the job description based on the keywords, skills, qualifications, and other criteria that the employer is looking for. ATS understands work history, job titles, relevant skills and education as well as contact information like your name, phone number, and email address. and relevance to the job description.
The score ranges from 0 to 100, where 0 means no match and 100 means perfect match.
To calculate the ATS score, the model should do the following steps:
- Identify the keywords,hard skills, soft skills, qualifications, and other criteria that are relevant for the job description.
- Compare hard skills from resume to job description .Hard skills have a high impact on your ATS score.Hard skills enable you to perform job-specific duties and responsibilities.Hard skills mentioned in the resume should matched with required skillset for job description role
- Compare soft skills from resume to job description .Soft skills have a medium impact on your ATS score.Soft skills are your traits and abilities that are not unique to any job.Soft skills mentioned in the resume should matched with required skillset for job description role
- Compare qualifications or certification matches with role from job description. Qualifications or Certifications have a high impact on your ATS Score. Qualification or Certifications from resume should be relevant to the role from job description.
- Compare keywords that are words included in the job description more than 3 times and not hard skills or soft skill.keywords have a low impact on your match score.
- Assign a weight to each item based on its importance for the job
- Sum up the weighted counts of each item to get the total score
- Normalize the total score to be between 0 and 100
Only provide the ATS score for the given resume and job description, explain why the score is high or low and How you calculated the score

misty viper Sep 25, 2023, 4:40 PM

#

misty viper prompt The ATS score is a measure of how well the resume matches the job descri...

this is the system prompt i am using

#


def prompt_create(resume,job_description):
    json_format ="""{
     "ATS_Score": "The overall score given to the resume by an Applicant Tracking System (ATS) (Should be from 0-100)" ,
    "Why": "In this you have to explain in detail how you compared the resume and jd and why the score is less or high."
    "How": "Give me explanation of how you calculated the score so i can calculate the ats_score too and provide the score for each step"
    }"""
    prompt = f"""
    \n
    Resume: "
    {resume}
    "
    
    \n
    Job Description: "
    {job_description}
    
    Calculate ATS Score:
   Do not include any explanations, only provide an RFC8259 compliant JSON response following this format without deviation: 
   {json_format}
    """
    
    return prompt

THis is how i am creating user prompt

sterile ingot Sep 25, 2023, 4:48 PM

#

Are you stuck on using the ATS score?

#

As @white juniper mentioned - word based evaluation would be a better approach

white juniper Sep 25, 2023, 4:49 PM

#

In my experience, the model just can't output math assessment well. Add in value judgements, and it's drive to be ethical and kind, and you get hallucinations galore and likely score inflation (rarely it will go low and harsh, siding with 'concern' to the harm a high score might cause -you- instead of the evaluee, and then it is likely to stay similarly critical to all, perhaps to maintain 'fairness'.

If it could do what you want - your prompt is very clear and specific.

white juniper Sep 25, 2023, 4:51 PM

#

misty viper prompt The ATS score is a measure of how well the resume matches the job descri...

However, I have one idea. Mind sharing an example task as well? Perhaps that engineer job description and doctor resume.

I don't think this would solve the math problem, but I may have a way to redefine much of the ethics componet, and I'm curious how that would turn out.

misty viper Sep 25, 2023, 4:53 PM

#

sterile ingot As <@215370453945024513> mentioned - word based evaluation would be a better ap...

I tried this prompt too but the results were quite same

Your role as an expert and strict Resume and Job Description Comparison Tool is to conduct a thorough analysis of a given resume and job description. Your objective is to generate a comprehensive JSON comparison report following the RFC8259 standard . The report should include an ATS score, Skills Match, Experience Alignment, Education and Certifications, Keyword Optimization, Highlighting Relevant Achievements, Gaps, and Development Needs. The ATS score, ranging from 0 to 100, will strictly reflect the degree of alignment between the resume and the job description, penalizing more for significant differences in the job role, experience, skills, or education sections. Your recommendations should be tailored to the specific resume and job description, focusing on improving the ATS score by considering factors such as relevant skills, experience, education, certifications, and optimized keyword usage. It is crucial that all comparisons are strictly related to the resume and job description, avoiding general matches.```

sterile ingot Sep 25, 2023, 4:54 PM

#

Here is how I would approach this - identity for the criteria for evaluating the resumes (drop teh ATS score). Come up with different examples for what makes a given resume strong and why (this is important). Then use these example input and output examples in your prompt .

#

This will greatly enhance your results and output reliability

#

ideally you can get to the point where you have a good # of examples and then you can have a model choose the best examples to include in context for evaluating a given resume.

misty viper Sep 25, 2023, 4:56 PM

#

sterile ingot Here is how I would approach this - identity for the criteria for evaluating the...

can you share one example so i can understand better

sterile ingot Sep 25, 2023, 4:56 PM

#

ya - hang on

misty viper Sep 25, 2023, 5:04 PM

#

this json file contains the engineer resumes comparision with jd of doctor

📎 test.json

#

thanks for supports from all of you . i really need help because i am software engineer somehow i got a job as ai engineer and have to complete this task i am just trying different prompt but nothing seems to be helping .first my result was random for the same resume .i tried testing today and by changing the tempreture to 0 my results for same jd and resume now it appears same

#

i was approaching the problem differently by using spacy and other packages like pyresparser for parsing and after parsing i was calculating embedding so i can do similarity match just like Resume-Matching repository or other repos are doing but my colleagues says that instead of doing everything yourself .just extract pdf text and job description text and pass it to gpt model it will do everything .

sterile ingot Sep 25, 2023, 5:15 PM

#

#

idk why it is an image lol

#

but you would provide multiple exampels like this for Low/Medium/High (or however you want to rate them)

#

the model will use these in-context examples as "guides" so to speak when doing it's evaluation

#

this also is helpful for getting reliable ouptut in JSON format

#

manualy adjusting the Chain of thought to follow the evaluation process you want is very powerful

misty viper Sep 25, 2023, 5:20 PM

#

thanks for the help now i understand somthing
instead of asking for numerical values i should ask for categorical values ( Low/Medium/High )

#

where i have to provide examples or guides do i have to fine tune the model for specific problems

sterile ingot Sep 25, 2023, 5:21 PM

#

Yes - that will likely get you better results

#

in this case you aren't going to be doing fine tuning of the model

#

you could do that eventually

#

but what you are actually going to be doing is building high quality examples to provide to the model

misty viper Sep 25, 2023, 5:27 PM

#

sterile ingot but what you are actually going to be doing is building high quality examples to...

well but how i would provide the examples to model like i have to maintain the list of messages where the different examples are saved so model will get the idea from it

sterile ingot Sep 25, 2023, 5:29 PM

#

You will include them as part of your prompt

#

I adjust the chain of thought output in this example to better align with your original prompt:

{
"rating": "low",
"chain_of_thought": [
"Identifying relevant criteria from the job description: The key hard skills for the Digital Marketing Specialist role include SEO/SEM, email marketing, social media, display advertising, analytics, and landing page creation.",
"Comparing hard skills from the resume to the job description: Jane Doe's resume primarily focuses on software engineering and system design. There is no mention of SEO/SEM, email marketing, social media, display advertising, or other relevant digital marketing hard skills.",
"Comparing soft skills from the resume to the job description: The job description emphasizes collaboration, analytical ability, and thought leadership. While Jane's resume mentions problem-solving and critical thinking, there is no evidence of her having experience in collaboration in a marketing context or providing thought leadership in the digital marketing space.",
"Comparing qualifications or certifications with the role from the job description: Jane's certifications, including the Certified Software Development Professional (CSDP) and UX Design Institute Professional Diploma in UX Design, are more aligned with software development and UX design, not digital marketing.",
"Comparing keywords from the resume to the job description: The resume lacks many of the keywords present in the job description such as 'digital marketing campaigns', 'brand awareness', 'website traffic', 'leads/customers', and 'emerging technologies'."
],
"summary": "Jane Doe's resume does not align well with the requirements and responsibilities of a Digital Marketing Specialist. Her hard skills, qualifications, and keywords are more focused on software engineering, and while she has some relevant soft skills, they are not presented in the context of a digital marketing role."
}

#

So ideally you would provide a couple of releveant exampels to the model for a given job description

#

at a minimum an example of one low, one medium, one high

misty viper Sep 25, 2023, 5:42 PM

#

sterile ingot You will include them as part of your prompt

well i will provide examples to it and will follow the guidelines of yours and thanks for the help. I don't have words to thanks yours for the time and efforts .
One dumb question i have what if i don't provide relevant examples . i don't have much knowledge related to AI. I think it already knows that software engineer job is different from doctor job.
like in my case even though it knows the resume is of software engineer and job description is of doctor and their skills set are different but stills it give me the good score why do u think it happens.

white juniper Sep 25, 2023, 5:44 PM

#

misty viper well i will provide examples to it and will follow the guidelines of yours and t...

It guesses without your guidance.

That could be fine for your needs, especially if your needs are very common.

However, the guess will have no guidance from you, and can come from any part of the AI's training dataset, including poor examples, and it won't have any feedback to guide it to only give answers that better match your own.

Also, it will be more likely to pick 'how to give the reply' from its training dataset range of answer styles, which might be far outside the JSON form you prefer.

misty viper Sep 25, 2023, 5:46 PM

#

Thanks for the explanation sir it clears my doubt .

#

Your efforts and time mean the world to me, and I'm truly grateful. I don't have enough words to express my thanks. If you ever need assistance from me, count on my full support. I'm here for you and community.

white juniper Sep 25, 2023, 5:51 PM

#

misty viper Thanks for the explanation sir it clears my doubt .

A couple possibly useful suggestions:

The AI tends to focus well on small, single steps. You have a LOT of focused steps.

You might actually want to have the AI process this sequentially instead of simultaneously, because if you ask it to consider 50 factors at once it shallowly averages them all (or it throws away most and focuses on a few ones it picked as key, this is especially common with disallowed content concerns).

But if you ask it to consider 1 or 2 factors it will deeply evaluate those few, not skipping any.

You can then use a program perhaps to put the many separate factor conclusions into the JSON detailed output you want.

Examples: You may be able to give some vague examples, like 'a doctor should be identified as a bad match for a software engineer' without going into details. However, focus on what matters, "We want to value computer-related experience highly, even if it's abstract" or "We are looking for someone already highly experienced with this exact position, so be highly critical about non-specific but related experience".

misty viper Sep 26, 2023, 5:00 PM

#

Well if i share my python code can you help me that my examples are correct or i am missing something from the suggestions you have given me.

misty viper Sep 26, 2023, 5:18 PM

#

this is the file i generated from the notebook i am trying to send the link buts its not allowing me

#

this is the file contains all the example and prompt

📎 New-API-Prompt-Testing.pdf

white juniper Sep 26, 2023, 5:34 PM

#

misty viper this is the file contains all the example and prompt

Are you aware of context limits? I work just with the web interface, not the playground, but I think there too you can only give so much information at once, and the model only has 'some amount' of the current conversation it can keep in memory.

So you can't give it a giant flood of info, but with some plug ins the web interface on a chatGPT+ and using 4 with plugins could perhaps read the PDF and use it to make decisions.

misty viper Sep 26, 2023, 5:58 PM

#

white juniper Are you aware of context limits? I work just with the web interface, not the pl...

I am passing the examples as a list of dict where the role is system and content is example.
My first item in a list is a system prompt about the task i want my gpt model. Then a list of examples and the final item in a list is resume and job description. I didn't thought about the context limit. But if there is a issue of context limit. It might have given me warning of token limit or something like that. What my today findings are my results are coming in between low to average and the responses are incomplete. Even though response limit is of 16k. I will check tomorrow again what's the issue about incomplete response.
I will try to be aware of context limit next time

#

Well what do you think if I approach this problem using keyword extraction without gpt stuff. Don't you think it will work better.

white juniper Sep 26, 2023, 6:03 PM

#

misty viper Well what do you think if I approach this problem using keyword extraction witho...

Not my field, outside my experience. I recommend try and see, go with what works, iterate and improve 🙂

misty viper Sep 26, 2023, 6:13 PM

#

white juniper Not my field, outside my experience. I recommend try and see, go with what work...

Ok thanks man I really appreciate your efforts and support.

sterile ingot Sep 26, 2023, 6:33 PM

#

@misty viper - looks like you might have hit you API limits is all

#

should be able to see in the openai portal for sure

#

you defintley might run into some context limits

#

Can you provide some exampel inputs / outputs where the responses are incomplete?

#

or they Chain of Though is giving you somethign unexpected

misty viper Sep 26, 2023, 6:37 PM

#

sterile ingot <@1031513584469016617> - looks like you might have hit you API limits is all

Let me check

sterile ingot Sep 26, 2023, 6:37 PM

#

I am guessing off - ": You exceeded your current quota, please check your plan and billing details."

#

I beleive that is the error when you have hit your qoute - you have to email them to get it extended

misty viper Sep 26, 2023, 6:38 PM

#

sterile ingot I am guessing off - ": You exceeded your current quota, please check your plan ...

Yes that's why I couldn't make a better progress today

misty viper Sep 26, 2023, 6:38 PM

#

sterile ingot I beleive that is the error when you have hit your qoute - you have to email the...

Thanks for the tip

sterile ingot Sep 26, 2023, 6:38 PM

#

but your examples look really strong to me

misty viper Sep 26, 2023, 7:15 PM

#

this is the output json its incomplete in the last response .I could not show more responses which are incomplete because notebook is stored in my office pc and i can only get the values from notebook output

📎 incomplete.json

sterile ingot Sep 26, 2023, 8:13 PM

#

These look complete to me?

sterile ingot Sep 26, 2023, 8:40 PM

#

@misty viper - which ones are incomplete?

misty viper Sep 26, 2023, 8:49 PM

#

sterile ingot These look complete to me?

last one in the json .and after that are missing too but i could not send them right now .i will send you in 12 hours right now i am at home

sterile ingot Sep 27, 2023, 3:27 PM

#

I assume you are doiing these one at a time right?

misty viper Sep 27, 2023, 3:37 PM

#

yes

#

one resume is compared with one jd and then i am appending all the resume comparision with other jds

#

thanks for your help .now my manager is not happy with my progress so they are telling me to use spacy and other open source implementations to make things quicker. Now we are doing things by calculating similarity score through cosine similarity and i am checking other methods too. thanks to you guys I learnt a lot from you

sterile ingot Sep 27, 2023, 4:27 PM

#

misty viper one resume is compared with one jd and then i am appending all the resume compar...

Ya - this approach will defintley eventually hit the context limits.

#

I honestly think you are on the right path - just getting your examples tuned in to incldue in the prompts

#

ideally you would have a model llm or otherwise that would pick the examples for you

#

as a preprocessing step

#

if you research - in context example selection model - you will find this is an area of growing research

misty viper Sep 27, 2023, 4:43 PM

#

sterile ingot Ya - this approach will defintley eventually hit the context limits.

thats what i realized after @white juniper told me about because my messages list contains 5 examples which i think will be more than 5-8k tokens .
I studied about how others are doing it like summarizing the previous chat content so model keep responding related to the content .what I think otherwise gpt will forget about the content thats why we have to send the previous messages to it.

sterile ingot Sep 27, 2023, 4:44 PM

#

You include the examples in every prompt

misty viper Sep 27, 2023, 4:44 PM

#

yes

sterile ingot Sep 27, 2023, 4:44 PM

#

Don't treat it as a contious conversation

#

5-8k for examples leaves you plenty of space for responses, way more than enough

misty viper Sep 27, 2023, 4:45 PM

#

do you know python i can explain the thing how i did that in python?

sterile ingot Sep 27, 2023, 4:47 PM

#

yep

misty viper Sep 27, 2023, 5:13 PM

#

well i made a list of examples and i have one system prompt which assigns role as job evaluator and then after iterating over the resume and jd data i pass the data to the generate_messages so my each gpt call has list of messages which contains system_prompt, examples and the last user prompt which has resume_data and job_data on which the model has to run an have to give the ouput.

#

for each resume data and jd_data .i am doing the same thing not appending the previous response just creating a new message_list whcih has system prompt ,examples then resume_data and jd_data

sterile ingot Sep 27, 2023, 7:19 PM

#

can you share the full notebook?

#Resume eval thread