5,697 questions
170
votes
3
answers
44k
views
How can you diff two pipelines in Bash?
How can you diff two pipelines without using temporary files in Bash? Say you have two command pipelines:
foo | bar
baz | quux
And you want to find the diff in their outputs. One solution would ...
147
votes
17
answers
75k
views
Functional pipes in python like %>% from R's magrittr
In R (thanks to magrittr) you can now perform operations with a more functional piping syntax via %>%. This means that instead of coding this:
> as.Date("2014-01-01")
> as.character(...
140
votes
25
answers
628k
views
How can I fix "kex_exchange_identification: read: Connection reset by peer"?
I want to copy data with scp in a GitLab pipeline using PRIVATE_KEY.
The error is:
kex_exchange_identification: read: Connection reset by peer
Connection reset by x.x.x.x port 22
lost connection
...
93
votes
6
answers
97k
views
Sklearn Pipeline: Get feature names after OneHotEncode In ColumnTransformer
I want to get feature names after I fit the pipeline.
categorical_features = ['brand', 'category_name', 'sub_category']
categorical_transformer = Pipeline(steps=[
('imputer', SimpleImputer(...
91
votes
3
answers
44k
views
What is the difference between pipeline and make_pipeline in scikit-learn?
I got this from the sklearn webpage:
Pipeline: Pipeline of transforms with a final estimator
Make_pipeline: Construct a Pipeline from the given estimators. This is a shorthand for the Pipeline ...
73
votes
1
answer
72k
views
How to extract tar archive from stdin?
I have a large tar file that I split. Is it possible to cat and untar the file using a pipeline?
Something like:
cat largefile.tgz.aa largefile.tgz.ab | tar -xz
instead of:
cat largefile.tgz.aa largfile....
69
votes
4
answers
143k
views
GitLab CI Pipeline on specific branch only
I'm trying to implement GitLab CI Pipelines to build and deploy an Angular app. In our project we have two general branches: master (for production only) and develop. For development we create feature/...
64
votes
2
answers
80k
views
Getting model attributes from pipeline
I typically get PCA loadings like this:
pca = PCA(n_components=2)
X_t = pca.fit(X).transform(X)
loadings = pca.components_
If I run PCA using a scikit-learn pipeline:
from sklearn.pipeline import ...
55
votes
14
answers
68k
views
Need to perform AWS calls for account xxx, but no credentials have been configured
I'm trying to deploy my stack to AWS using cdk deploy my-stack. When doing it in my terminal window it works perfectly, but when I'm doing it in my pipeline I get this error: Need to perform AWS calls ...
45
votes
2
answers
34k
views
Gitlab pipeline - reports config contains unknown keys: cobertura
I'm not able to run the GitLab pipeline due to this error:
Invalid CI config YAML file
jobs:run tests:artifacts:reports config contains unknown keys: cobertura
45
votes
6
answers
59k
views
Pipe complete array-objects instead of array items one at a time?
How do you send the output from one CmdLet to the next one in a pipeline as a complete array-object instead of the individual items in the array one at a time?
The problem - Generic description
As ...
45
votes
3
answers
31k
views
How to insert Keras model into scikit-learn pipeline?
I'm using a scikit-learn custom pipeline (sklearn.pipeline.Pipeline) in conjunction with RandomizedSearchCV for hyper-parameter optimization. This works great.
Now I would like to insert a Keras model ...
39
votes
7
answers
32k
views
Share gitlab-ci.yml between projects
We are thinking of moving our CI from Jenkins to GitLab. We have several projects that have the same build workflow. Right now we use a shared library where the pipelines are defined and the Jenkinsfile ...
39
votes
2
answers
25k
views
What exactly is a dual-issue processor?
I came across several references to the concept of a dual issue processor (I hope this even makes sense in a sentence). I can't find any explanation of what exactly dual issue is. Google gives me ...
39
votes
8
answers
31k
views
How to properly pickle sklearn pipeline when using custom transformer
I am trying to pickle a sklearn machine-learning model and load it in another project. The model is wrapped in a pipeline that does feature encoding, scaling, etc. The problem starts when I want to use ...
36
votes
7
answers
35k
views
How do you determine if WPF is using Hardware or Software Rendering?
I'm benchmarking a WPF application on various platforms and I need an easy way to determine if WPF is using hardware or software rendering.
I seem to recall a call to determine this, but can't lay ...
36
votes
7
answers
27k
views
"Piping" output from one function to another using Python infix syntax
I'm trying to replicate, roughly, the dplyr package from R using Python/Pandas (as a learning exercise). Something I'm stuck on is the "piping" functionality.
In R/dplyr, this is done using the pipe-...
35
votes
4
answers
23k
views
How to access scrapy settings from item Pipeline
How do I access the Scrapy settings in settings.py from the item pipeline? The documentation mentions it can be accessed through the crawler in extensions, but I don't see how to access the crawler in ...
34
votes
8
answers
39k
views
Why am I getting "Pipeline failed due to the user not being verified" & "Detached merge request pipeline" on a GitLab merge request?
When a non-owner dev pushes a branch to our GitLab repo, it returns a "pipeline failed" message, with the detail "Pipeline failed due to the user not being verified". On the dev's ...
33
votes
2
answers
29k
views
Extract the second element of a tuple in a pipeline
I want to be able to extract the Nth item of a tuple in a pipeline, without using with or otherwise breaking up the pipeline. Enum.at would work perfectly except for the fact that a tuple is not an ...
32
votes
2
answers
32k
views
How to use GitHub Release Version Number in GitHub Action
I have created a GitHub repo that has an action to build the npm package and publish it to npmjs.com. The trigger for my action is the creation of a new release in GitHub. When creating the new ...
31
votes
1
answer
30k
views
pipeline in docker exec from command line and from python api
What I am trying to implement is invoking mysqldump in a container and dumping the database into the container's own directory.
At first I tried the command below:
$ docker exec container-name mysqldump [options] ...
31
votes
4
answers
5k
views
Do function pointers force an instruction pipeline to clear?
Modern CPUs have extensive pipelining, that is, they are loading necessary instructions and data long before they actually execute the instruction.
Sometimes, the data loaded into the pipeline gets ...
30
votes
4
answers
35k
views
Run a program in a ForEach loop
I'm trying to get this simple PowerShell script working, but I think something is fundamentally wrong. ;-)
ls | ForEach { "C:\Working\tools\custom-tool.exe" $_ }
I basically want to get ...
30
votes
6
answers
11k
views
Assign intermediate output to temp variable as part of dplyr pipeline
Q: In an R dplyr pipeline, how can I assign some intermediate output to a temp variable for use further down the pipeline?
My approach below works. But it assigns into the global frame, which is ...
29
votes
4
answers
77k
views
how to use xargs with sed in search pattern
I need to use the output of a command as a search pattern in sed. I will give an example using echo, but assume it could be a more complicated command:
echo "some pattern" | xargs sed -i 's/{}/...
29
votes
11
answers
20k
views
How to extract best parameters from a CrossValidatorModel
I want to find the parameters of ParamGridBuilder that make the best model in CrossValidator in Spark 1.4.x.
In the Pipeline example in the Spark documentation, they add different parameters (numFeatures, ...
28
votes
4
answers
8k
views
-> operator in Clojure
Is the -> operator in Clojure (and what is this operator called in Clojure-speak?) equivalent to the pipeline operator |> in F#? If so, why does it need such a complex macro definition, when (|>) is ...
28
votes
1
answer
19k
views
Notify all group members of failed pipelines in GitLab
The goal is to have everyone get a notification for every failed pipeline (at their discretion). Currently, any of us can run a pipeline on this project branch, and the creator of the pipeline gets an ...
28
votes
2
answers
16k
views
Renovate: Combine all updates to one branch/PR
Renovate updates the packages as soon as there is a new version. But Renovate also creates a separate PR/branch for each update. So if new versions are released for 5 of my packages, Renovate will ...
28
votes
2
answers
19k
views
How to set system path variable in github action workflow
I was wondering how I can set the system path variables in the GitHub actions workflow.
export "$PATH:$ANYTHING/SOMETHING:$AA/BB/bin"
27
votes
10
answers
45k
views
Pipeline OrdinalEncoder ValueError Found unknown categories
Please take it easy on me. I’m switching careers into data science and don’t have a CS or programming background—so I could be doing something profoundly stupid. I've researched for a few hours ...
27
votes
7
answers
99k
views
Singleton array array(<function train at 0x7f3a311320d0>, dtype=object) cannot be considered a valid collection
Not sure how to fix this. Any help is much appreciated. I saw this: "Vectorization: Not a valid collection", but I'm not sure if I understood it.
train = df1.iloc[:,[4,6]]
target =df1.iloc[:,[0]]
def train(...
27
votes
3
answers
39k
views
Put customized functions in Sklearn pipeline
In my classification scheme, there are several steps including:
SMOTE (Synthetic Minority Over-sampling Technique)
Fisher criteria for feature selection
Standardization (Z-score normalisation)
SVC (...
27
votes
3
answers
36k
views
YAML_FILE_ERROR: YAML file does not exist
I'm trying to implement a pipeline on AWS, but I get an error:
YAML_FILE_ERROR: YAML file does not exist
I don't know why. I'm using a GitHub repo for a MEAN stack project; the entry file is docker-compose. ...
26
votes
3
answers
25k
views
Is the following possible in PowerShell: "Select-Object <Property>.<SubProperty>"?
The scenario: I'm using Select-Object to access properties of a piped object, and one of those properties is itself an object. Let's call it PropertyObject. I want to access a property of that ...
26
votes
3
answers
19k
views
CI/CD pipeline with PostgreSQL failed with "Database is uninitialized and superuser password is not specified" error
I'm using Bitbucket Pipelines with PostgreSQL for CI/CD. According to this documentation, the PostgreSQL service has been described in bitbucket-pipelines.yml this way:
definitions:
services:
postgres:...
25
votes
3
answers
16k
views
sklearn pipeline - how to apply different transformations on different columns
I have a dataset that has a mixture of text and numbers, i.e., certain columns have text only and the rest have integers (or floating point numbers).
I was wondering if it was possible to build a pipeline ...
25
votes
9
answers
8k
views
Inform right-hand side of pipeline of left-side failure?
I've grown fond of using a generator-like pattern between functions in my shell scripts. Something like this:
parse_commands /da/cmd/file | process_commands
However, the basic problem with this ...
25
votes
4
answers
40k
views
How to perform string manipulation while declaring env vars in GitHub Actions
I have a GitHub repository like the following:
johndoe/hello-world
I am trying to set the following environment variables in GitHub Actions:
env:
DOCKER_HUB_USERID: ${{ github.actor }}
...
25
votes
1
answer
38k
views
how to compare two fields in a document in pipeline aggregation (mongoDB) [duplicate]
I have a document like the one below:
{
"user_id": NumberLong(1),
"updated_at": ISODate("2016-11-17T09:35:56.200Z"),
"created_at": ISODate("2016-11-17T09:35:07.981Z"),
"banners": {
"...
25
votes
3
answers
53k
views
Unix tr command to convert lower case to upper AND upper to lower case
So I was searching around and using the command tr you can convert from lower case to upper case and vice versa. But is there a way to do this both at once?
So:
$ tr '[:upper:]' '[:lower:]' or $ tr ...
24
votes
5
answers
88k
views
Required context class hudson.FilePath is missing Perhaps you forgot to surround the code with a step that provides this, such as: node
When I load another Groovy file in a Jenkinsfile, it shows me the following error:
"Required context class hudson.FilePath is missing
Perhaps you forgot to surround the code with a step that provides this, ...
23
votes
3
answers
10k
views
Performance of x86 rep instructions on modern (pipelined/superscalar) processors
I've been writing in x86 assembly lately (for fun) and was wondering whether or not rep prefixed string instructions actually have a performance edge on modern processors or if they're just ...
23
votes
3
answers
31k
views
How to pass a parameter to only one part of a pipeline object in scikit learn?
I need to pass a parameter, sample_weight, to my RandomForestClassifier like so:
X = np.array([[2.0, 2.0, 1.0, 0.0, 1.0, 3.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...
23
votes
3
answers
1k
views
Haskell performance implementing unix's "cat" program with Data.ByteString
I have the following Haskell code, implementing a simple version of the "cat" unix command-line utility. Testing performance with "time" on a 400MB file, it's about 3x slower. (the exact script I am ...
23
votes
6
answers
24k
views
Output binary data on PowerShell pipeline
I need to pipe some data to a program's stdin:
First 4 bytes are a 32-bit unsigned int representing the length of the data. These 4 bytes are exactly the same as C would store an unsigned int in ...
23
votes
2
answers
9k
views
how to use GNU Time with pipeline
I want to measure the running time of some SQL query in PostgreSQL. Using the Bash built-in time, I could do the following:
$ time (echo "SELECT * FROM sometable" | psql)
I like GNU time, which provides ...
22
votes
2
answers
17k
views
IIS7 Integrated vs Classic Pipeline - which uses more ASP.NET threads?
With integrated pipeline, all requests are passed through ASP.NET, including images, CSS.
Whereas, in classic pipeline, only requests for ASPX pages are by default passed through ASP.NET.
Could ...
21
votes
7
answers
15k
views
Scrapy, Python: Multiple Item Classes in one pipeline?
I have a Spider that scrapes data which cannot be saved in one item class.
For illustration, I have one Profile Item, and each Profile Item might have an unknown number of Comments. That is why I want ...