0 votes
1 answer
73 views

I am creating a one-step look-ahead alignment algorithm for protein alignment. I am now implementing a seeded option: seeds are also provided to the function, in which the gaps are stripped ...
Jaime Duarte's user avatar
0 votes
1 answer
91 views

I have a list of strings I want to search with UniProt and then get the first entry and its information. My problem is that the following code does return results, but it doesn't return the first ...
Melissa Flassig's user avatar
1 vote
1 answer
72 views

I have a Snakemake pipeline (https://github.com/V-Varga/SPOT-BGC/tree/main), where I generate input and output file names for various intermediate steps using wildcards that refer back to file and ...
Vi_Varga's user avatar
2 votes
1 answer
183 views

I have written a bioinformatics pipeline (https://github.com/V-Varga/SPOT-BGC/tree/main) in Snakemake. While it has been functional until now, one of the datasets I have tried to use it on is ...
Vi_Varga's user avatar
1 vote
0 answers
47 views

I’m trying to build a BLAST nucleotide database using makeblastdb (NCBI BLAST 2.16.0) inside a Singularity container. My FASTA headers have been renamed to be unique in the format: >file<file#&...
Asad Prodhan's user avatar
2 votes
0 answers
64 views

I was working on a bioinformatics task in Python and used UMAP in the process. Although the model was fitted in under 20 seconds, the transformation failed (or at least I conclude so), given that there was ...
user31535378's user avatar
0 votes
0 answers
158 views

The complete code and data are available at: Google Disk. I'm working on a high-dimensional regression problem and have built a Transformer-based model in PyTorch. While the model trains, I'm observing ...
氢氰酸's user avatar
1 vote
1 answer
53 views

I am working on a Rust script to process BAM files using the rust-htslib crate and expose the functionality to Python using pyo3. My goal is to read an input BAM file (which includes both a header and ...
One thousand's user avatar
3 votes
1 answer
233 views

Why do non-identical inputs to ProtBERT generate identical embeddings when non-whitespace-separated? I've looked at answers here etc. but they appear to be different cases where the slicing of the out....
Maximilian Press's user avatar
3 votes
1 answer
57 views

I’m currently working with a large dataset and need help merging multiple .fasta files. Although I’m not an expert, I’ve attempted to automate this process using a Python script. However, the merging ...
Andrea S.'s user avatar
1 vote
1 answer
43 views

This is the CDS for Pun1 from https://www.ncbi.nlm.nih.gov/datasets/gene/id/107859694/products/ NM_001324769.1:37-1359 LOC107859694 [organism=Capsicum annuum] [GeneID=107859694] [region=cds] ...
alkyl official's user avatar
2 votes
8 answers
189 views

I have a fastq file containing several sequences with headers such as : tail SRR11149706_1.fastq @SRR11149706.16630586 16630586/1 CCCAACAACAACAACAGCAACCTCCTCACGCCAACGCCGATCCCGCCGCTGTTTTCCAA @...
CaroZ's user avatar
2 votes
1 answer
75 views

I'm using the lilikoi R package to follow a built-in example from the official documentation. While most of the steps work correctly, I encounter an error when I attempt to run the lilikoi....
Давид Пирић's user avatar
-4 votes
1 answer
146 views

I would like to generate a few different plots in ggplot2 and assemble them in an image tool like MS publisher or Inkscape to get a single publication-ready figure. Ideally, I would like to produce ...
Cobalamin's user avatar
0 votes
1 answer
40 views

I have written a function called "resolve" to help manage inputs for my nextflow DSL2 workflow. It works how I want, but I’d ideally like to put it in a separate utils.nf file and then ...
rbierman's user avatar
0 votes
0 answers
35 views

I'm a beginner here. I've built a few Nextflow workflows for other tools before. The command for PSORTb requires you to specify the directory where the output is stored, and this is where I feel the ...
Sravan Krishna's user avatar
0 votes
2 answers
137 views

I am working with Gromacs .gro files in PyMol and running into problems with multi-stranded molecules. .Gro files do not have chain identifiers, which PyMol apparently needs to calculate cartoon ...
Erik's user avatar
1 vote
0 answers
40 views

I'm doing an assignment for an Ecology course for my master's degree. The instructions are as follows: Using the dataset "tussock" from the FD package seen in the first class: 1- Firstly, ...
Jonas Rosa's user avatar
-1 votes
1 answer
93 views

I am a beginner at using Linux bash for bioinformatics, and I recently encountered an error with this 'awk' command. ChatGPT's suggestion is not helping, and the task is very basic. I have a ...
Luka Jašović's user avatar
2 votes
0 answers
101 views

I'm pretty new to Rust and I was trying to write a project to get my hands dirty with the language, really figure it out. I wanted to write a bioinformatics tool that uses a multiple sequence ...
Abhirath Anand's user avatar
0 votes
1 answer
52 views

I'm trying to count the frequency of SNPs in every window of 100,000 base positions. I'm using a VCF file I've already prepared, and my professor showed me how to use code such as the following: inputfile=open("...
yamianne's user avatar
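For readers landing on the SNP-window question above, here is a minimal sketch of one way to bin VCF records into 100,000-base windows. It is not the code from the question (which is truncated). The function name `count_snps_per_window` and the example records are hypothetical. The sketch assumes tab-separated VCF lines with CHROM in the first column and a 1-based POS in the second, and it assumes the file has already been reduced to SNPs, as the question says.

```python
from collections import Counter
from typing import Iterable

WINDOW = 100_000  # window size in bases, taken from the question text


def count_snps_per_window(lines: Iterable[str], window: int = WINDOW) -> Counter:
    """Count VCF records per (chromosome, window index).

    Hypothetical helper. Assumes every data line is already a SNP record.
    """
    counts: Counter = Counter()
    for line in lines:
        # Skip blank lines and the '##' meta / '#CHROM' header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos = fields[0], int(fields[1])
        # POS is 1-based, so positions 1..window fall in window index 0.
        counts[(chrom, (pos - 1) // window)] += 1
    return counts


# Made-up records for illustration only.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\n",
    "chr1\t150\t.\tA\tG\n",
    "chr1\t99999\t.\tC\tT\n",
    "chr1\t100001\t.\tG\tA\n",
]

for (chrom, idx), n in sorted(count_snps_per_window(example).items()):
    print(chrom, idx * WINDOW + 1, (idx + 1) * WINDOW, n)
```

To run it on a real file, the same function could be called as `count_snps_per_window(open(path))`, since it only needs an iterable of lines.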
0 votes
1 answer
102 views

I am trying to get the global Bayesian fit and local fit for the raw data curve of a Bio-Layer Interferometry (BLI) experiment. In a BLI experiment, you first coat the tip of a sensor with ...
Cobalamin's user avatar
0 votes
1 answer
74 views

I'm a beginner and I'm working on a Python script that processes gene expression data, and I'm trying to plot volcano plots for different brain regions (EC, PC, and Hippocampus). However, I keep ...
Farah Yasser's user avatar
0 votes
1 answer
127 views

Say I have a dataframe that denotes node colors for a given tree (3 nodes for clarity, but I would like to expand this to 1000 nodes) note_df nodes color node1 #0d3b66 node2 #faf0ca node3 #f4d35e And ...
Sam Degregori's user avatar
0 votes
1 answer
83 views

I would like to convert .mol files into CDD .mmcif files which is the input format of alphafold 3. In the code of AF3, we can find a Python function which enables it. This function uses the Python ...
user30270061's user avatar
0 votes
0 answers
22 views

I am trying to integrate jbrowse into an existing platform. The current platform provides me with values such as the bin size, bpperpixel and start and end position. Is there a way to adjust jbrowse ...
david dami's user avatar
1 vote
1 answer
76 views

Following this post I managed to put together a small function to place within a bigger text body (FASTA) shorter strings determined from another file based on some conditions (e.g. 100 events from a ...
Matteo's user avatar
0 votes
0 answers
114 views

Terrible title, and I'll update if a more effective way of asking can be suggested. Problem We're running a bioinformatics pipeline that uses FastQC for quality control. The pipeline is written in ...
GilChrist19's user avatar
1 vote
2 answers
108 views

This is one of my first tasks with actual Python code. What I need to do is to import a set of (FASTA) sequences, select those with a length between 400 and 500 (base pairs) characters, and randomly ...
Matteo's user avatar
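The length-selection step described in the question above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: `read_fasta` and `filter_by_length` are hypothetical helper names, the 400 and 500 bounds are treated as inclusive, and the random step that follows in the question is left out because the excerpt is cut off before describing it.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records: Iterable[Record],
                     min_len: int = 400,
                     max_len: int = 500) -> List[Record]:
    """Keep records whose sequence length is within [min_len, max_len]."""
    return [(h, s) for h, s in records if min_len <= len(s) <= max_len]


# Made-up sequences for illustration only.
demo = [
    ">too_short", "A" * 399,
    ">lower_bound", "C" * 400,
    ">upper_bound", "G" * 250, "G" * 250,  # sequence split over two lines
    ">too_long", "T" * 501,
]

kept = filter_by_length(read_fasta(demo))
print([header for header, _ in kept])
```

If Biopython is available, its `SeqIO.parse` could replace the hand-written parser; the filter itself would stay the same.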
0 votes
1 answer
193 views

I've been working with AlphaFold 3 on a Linux HPC, and I've been trying to use Posebusters to evaluate the results of AlphaFold 3 by comparing the predicted structures with the ground truth structures ...
melee's user avatar
0 votes
0 answers
42 views

I have an integrated Seurat object with approximately ~480k cells, integrated using the sketch-based method detailed here (leveraging the on-disk storage capabilities of BPCells). I keep getting this ...
user29769413's user avatar
0 votes
0 answers
96 views

I am a beginner in R, and have only been doing it for a couple weeks. Recently I have been trying to engage with more advanced R material for my work in bioinformatics. I found out about ggplot and ...
user29964013's user avatar
-1 votes
1 answer
55 views

I am interested in running Uniprot's Protein descriptor model, ProtNLM, to add some bonus descriptors for a big chunk of protein sequence I have. They have a trial notebook available here. Here is the ...
lunchbox7804's user avatar
1 vote
1 answer
76 views

I am trying to write a script analyzing codon usage in sequence utilizing the codon-bias package. I am trying to use the class codonbias.scores.FrequencyOfOptimalCodons, but when I do so in my code: ...
Shlomo Goren's user avatar
1 vote
0 answers
60 views

I am developing an MRI file viewer app using dash and plotly. The way it works is I can select a specific MRI file from my dataset and the app will generate a slider that could take you through the ...
Malek kchaou's user avatar
2 votes
2 answers
240 views

I am curious whether I am missing something in the Polars Expression library in how this could be done more efficiently. I have a dataframe of protein sequences, where I would like to create k-long ...
Olga Botvinnik's user avatar
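As a reference point for the question above, here is a plain-Python sketch of splitting a protein sequence into windows of length k. The excerpt is truncated after "k-long", so this assumes the goal is overlapping substrings (k-mers) taken with a step of one residue. The function name `kmers` is hypothetical, and this is not a Polars solution, only a baseline that an expression-based version could be checked against.

```python
from typing import List


def kmers(sequence: str, k: int) -> List[str]:
    """Return all overlapping substrings of length k, in order.

    Hypothetical helper; returns an empty list when the sequence
    is shorter than k.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# Made-up sequence for illustration only.
print(kmers("MKVLA", 3))
```

A sequence of length n yields n - k + 1 k-mers, which is a quick sanity check for any faster implementation.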
0 votes
1 answer
63 views

Could anyone suggest what is wrong with this Snakefile code? I am trying to learn Snakemake, so could you please suggest any useful resources to read more about Snakemake? I will be thankful for all your ...
bioinfonext's user avatar
2 votes
2 answers
35 views

I got a fasta file assembled from RNA-seq data like this: >ENSP00000493376.2|ENST00000641515.2|ENSG00000186092.7|OTTHUMG00000001094.4|OTTHUMT00000003223.4|OR4F5-201|OR4F5|326 ...
Ulrike Resch's user avatar
1 vote
0 answers
55 views

I am modifying bpps.py in the Arnie Python package to integrate LinearFold for RNA secondary structure predictions. However, I keep encountering "KeyError: 'linearfold_v'", and issues when ...
Hugh Redford's user avatar
1 vote
1 answer
122 views

I have a channel which emits as follows: [[A, B, C, D], 1] [[A, B, C, D], 2] [[A, B, C], 3] [[A, B, V], 4] [[A], 5] [[A1, B1, D], 7] [[A1, B1, D], 8] I have another parameter defined by the user. The ...
Ahkam's user avatar
0 votes
0 answers
28 views

I am working on a tandem repeat project and I want to define a repeated motif that is complex, including indels and substitutions, with most bases being conserved. The motif varies in length between ...
Grégoire Blavier's user avatar
1 vote
1 answer
78 views

I was trying to create a Snakemake script to automate 3 tasks. The first one edits a .seg file so that it is the correct input for the next rule, the second rule computes an analysis for ...
Alessandra Bonilla Salon's user avatar
0 votes
0 answers
44 views

I'm trying to use the Arnie Python package for RNA folding with ViennaRNA and LinearFold. However, even after modifying utils.py to include debugging statements and manually add ViennaRNA, Arnie does ...
Hugh Redford's user avatar
1 vote
1 answer
105 views

So, I have 3 .txt files according to the three categories of gene enrichment I downloaded from the GO platform and they just can't be read in R, I think it's due to the inconsistent columns. First I ...
Julieta González's user avatar
0 votes
0 answers
30 views

I need to write a customized Jalview annotation to colour (for example in green) different positions for each sequence_id (on each row). I need an example (I've already checked the doc). Help, Thx. ...
mscr's user avatar
0 votes
0 answers
34 views

I wanted to see if anyone here has experience using the iNaturalist API because I'm having an issue with it. I built a web page using Python, but I'm not sure if I'm making the request incorrectly or ...
Adolfo Morales's user avatar
0 votes
1 answer
128 views

I am new to anndata and would like to know whether an issue that I am running into is expected or not. I have 28 h5ad files (Tabula Sapiens)(https://figshare.com/articles/dataset/Tabula_Sapiens_v2/27921984), ...
Danish Zahid Malik's user avatar
1 vote
1 answer
108 views

I am trying to parallelize my code using HPX in order to improve performance. Below is the original code and my attempt to refactor it using HPX. Original Code: std::vector<std::vector<std::pair&...
tlparolin's user avatar
0 votes
0 answers
35 views

I have been tasked with optimizing the productivity of Park and Ramirez's Bioreactor using TensorFlow. To achieve this, I generate a dataset by creating random values for the "Feed" variable,...
Bernardo Ribeiro's user avatar
1 vote
0 answers
213 views

After running minimap2 -ax asm5 --eqx with two fasta files that are both hifiasm assemblies scaffolded to the same reference via ragtag, minimap crashes with the following output: [M::mm_idx_gen::25....
altuffin's user avatar
