#make-data-count-finding-data-references | Kaggle | Page 1

true lotus Jun 11, 2025, 6:21 PM

#

:o

tawny condor Jun 15, 2025, 12:08 AM

#

Hello.

Can you clarify, in the examples in the data tab, what evidence was used to determine that the secondary sources are indeed secondary?

For PDB 5yfp, is it because the text was in the introduction?

Although the referring paper is far from my field, E-MTAB-10217 seems primary to me.

From https://doi.org/10.3389/fimmu.2021.690817 (Papoutsopoulou, Stamatia, et al. "Impact of interleukin 10 deficiency on intestinal epithelium responses to inflammatory signals." Frontiers in immunology 12 (2021): 690817.):

RNA Sequencing
Host transcriptome analysis was performed by RNA sequencing of unstimulated and TNF (40 ng/ml) stimulated enteroid cultures from C57BL/6J mice (N = 3). RNA extraction and purification from enteroids were performed using the RNeasy mini kit (Qiagen), as per manufacturer's instructions. Strand-specific sequencing libraries were prepared with the TruSeq stranded Total RNA kit (Illumina) from 1 µg total RNA of each sample and sequenced on an Illumina HiSeq2000 (100-nucleotide paired-end reads)

It reads to me like the RNA sequencing results were explicitly generated from mouse intestinal enteroids that were experimentally stimulated with TNF specifically for this study.
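For anyone who wants to check where identifiers like these appear in a paper's text, here is a minimal sketch. The two patterns and the sample sentence are my own assumptions for illustration, not the competition's official rules, and the code only locates identifiers; it does not decide whether a citation is primary or secondary.

```python
import re

# Hypothetical patterns, assumed for illustration only.
ACCESSION_PATTERNS = {
    # ArrayExpress-style accessions such as E-MTAB-10217
    "arrayexpress": re.compile(r"\bE-[A-Z]{4}-\d+\b"),
    # DOI-like strings such as 10.3389/fimmu.2021.690817
    "doi": re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+"),
}


def find_accessions(text):
    """Return sorted (kind, identifier) pairs found in the text."""
    hits = set()
    for kind, pattern in ACCESSION_PATTERNS.items():
        for match in pattern.finditer(text):
            # Strip trailing punctuation that the pattern may have swallowed.
            hits.add((kind, match.group(0).rstrip(".,;)")))
    return sorted(hits)


# Made-up sentence that only reuses identifiers mentioned in this thread.
sample = (
    "Data are available under accession E-MTAB-10217 "
    "(see https://doi.org/10.3389/fimmu.2021.690817)."
)
print(find_accessions(sample))
```

Which section of the paper a match falls in (introduction vs. methods) would still have to be worked out separately.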

Thanks

tawny condor Jun 15, 2025, 6:47 PM

#

tawny condor Hello. Can you clarify, in the examples in the data tab, what evidence was use...

I could make a similar argument for the other data source in the Papoutsopoulou, Stamatia, et al. paper (PRJE43395):

Assay for Transposase-Accessible Chromatin Sequencing
The ATAC sequencing protocol was based on the protocol of Buenrostro et al. (31) and modified for the specific cell type, as described below. Enteroid cultures were maintained in 24-well plates, as described above, and they were either left unstimulated or they were treated with 40 ng/ml TNF for 2 h (four wells per condition). At the end of stimulation, the medium was removed, the plate was transferred on ice, and 1 ml cold PBS was added in each well. [...]

It reads to me like the authors similarly generated the data directly, by treating intestinal organoids with TNF to identify chromatin accessibility regions within the scope of this specific investigation.

tawny condor Jun 16, 2025, 9:01 AM

#

Also, what PDF reading library is available to us if we have to turn off the internet for submission?

tawny condor Jun 16, 2025, 9:22 AM

#

tawny condor Also, what pdf reading library is available to us if we have to turn off the int...

I found a thread that answers this more basic question in the discussion here.

I guess the discussion is more active than the Discord. I will ask my previous question in the discussion.

Make Data Count - Finding Data References

Identify scientific data use in papers and classify how they are mentioned.

native cloud Jun 26, 2025, 2:22 PM

#

Do we need to use the models listed under the Models section, which are mostly Qwen versions, or can we use other models such as Gemma or Llama?

ivory cedar Jun 29, 2025, 5:47 PM

#

native cloud Do we need to use models which are under models section which are mostly qwen ve...

nah, you can use any freely & publicly accessible model

inland vapor Jul 3, 2025, 10:15 AM

#

Is there any update regarding the dataset?

heady lynx Jul 8, 2025, 11:06 PM

#

inland vapor Is there any update regarding dataset?

Here are the new modified training labels

📎 new_training_labels.csv
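A quick way to sanity-check a labels file like this is to count the rows per label. This is a minimal sketch: the column name `type` and the sample rows are assumptions for illustration, so adjust them to whatever the attached CSV actually contains.

```python
import csv
import io
from collections import Counter


def count_label_types(csv_file, type_column="type"):
    """Count rows per label value in an open CSV file object."""
    reader = csv.DictReader(csv_file)
    return Counter(row[type_column] for row in reader)


# Made-up rows with assumed column names, standing in for the real file.
example = io.StringIO(
    "article_id,dataset_id,type\n"
    "paper_a,dataset_x,Primary\n"
    "paper_a,dataset_y,Secondary\n"
    "paper_b,dataset_z,Secondary\n"
)
print(count_label_types(example))
```

For the real file, something like `with open("new_training_labels.csv", newline="") as f: print(count_label_types(f))` should work, assuming the column is named the same.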

#

Btw, I'm looking for someone to team up with

inland vapor Jul 10, 2025, 10:25 PM

#

heady lynx Btw i'm looking for someone to team up with

Can you share your Kaggle profile?

heady lynx Jul 10, 2025, 11:27 PM

#

https://www.kaggle.com/bossjack

violet sphinx Jul 16, 2025, 2:42 PM

#

hello

umbral vault Aug 26, 2025, 1:56 PM

#

I use MinerU to extract text from PDFs; having cleaner extracted data might help.

https://www.kaggle.com/datasets/omiderfanmanesh/make-data-count-dataset-mineru-extraction

You can download it from the link above.

upper steeple Sep 8, 2025, 1:04 PM

#

Hi there, has anyone run into issues with their notebook being marked as failed even though their code ran successfully? It seems to be running into a papermill exception when converting the notebook after code execution. Unfortunately, it is marking all of my notebook versions as failed.

forest barn Nov 7, 2025, 6:59 PM

#

upper steeple Hi there, has anyone run into issues with their notebook being marked as failed ...

I updated one of my projects today and it worked, and I couldn't find the issue you mentioned in any of my projects.