Search Results

Results tagged with dataset

Search options: answers only, not deleted, user 12359

7 results

Requests for datasets are off-topic on this site. Use this tag for questions concerning creating, processing, or maintaining datasets.

5 votes

Data Sets suitable for k-means

In complement to JEquihua's great answer, I would like to add 2 points. Case 3 is a nice example of a case where it would be useful to have a clustering algorithm that doesn't give only the cluster a …

Franck Dernoncourt

48.7k

answered Dec 15, 2013 at 19:00

2 votes

Looking for redacted text corpus

For medical data, a few datasets can be found at:
Physician notes with annotated PHI
1) i2b2 2006 Deidentification and Smoking Challenge's data set: NLP Data Set #1B: 889 de-identified discharge …

Franck Dernoncourt

48.7k

answered Jun 5, 2020 at 17:33

8 votes

What is exactly meant by a "data set"?

In the open data discipline, a dataset is the unit used to measure the information released in a public open data repository. The European Open Data portal aggregates more than half a million datasets. …

Franck Dernoncourt

48.7k

answered Nov 5, 2016 at 17:08

5 votes

Plotting data from several files on one plot

One way to do it is to use points:

    x <- seq(0, 2*pi, len = 51)
    y1 = sin(x)
    y2 = cos(x)
    plot(x, y1)
    points(x, y2, col = "red")

If your data files share a common axis, you can use matplot: a <- ma …

Franck Dernoncourt

48.7k

answered Oct 16, 2013 at 4:11

3 votes

A suitable corpus for training skip-though vectors

Common Crawl corpus: consists of 145 TB of data from 1.81 billion webpages as of August 2015
http://www.lrec-conf.org/proceedings/lrec2018/pdf/889.pdf: see Table 1 for several summarization corpora, …

Franck Dernoncourt

48.7k

answered May 30, 2018 at 3:42

3 votes

Accepted

Why does the Ciphar 10 tutorial on TensorFlow crop the images to be 24x24?

As a side note, the CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. This means that 24x24 cropping keeps most of the image. …

Franck Dernoncourt

48.7k

answered Jan 13, 2017 at 0:29
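
To put a number on "keeps most of the image" in the CIFAR-10 answer above, here is a minimal sketch of the arithmetic. It assumes a square 24x24 crop taken from a 32x32 image; the function names are hypothetical and do not come from the answer or from TensorFlow.

```python
# Illustrative arithmetic only: how much of a 32x32 CIFAR-10 image
# survives a 24x24 crop. Function names are hypothetical.

FULL_SIDE = 32  # image side length stated in the answer
CROP_SIDE = 24  # crop side length from the question title


def retained_fraction(full_side: int, crop_side: int) -> float:
    """Return the share of pixels kept when cropping a square image."""
    return (crop_side * crop_side) / (full_side * full_side)


def num_crop_positions(full_side: int, crop_side: int) -> int:
    """Count the distinct top-left offsets at which the crop fits."""
    return (full_side - crop_side + 1) ** 2


if __name__ == "__main__":
    print(f"pixels kept: {retained_fraction(FULL_SIDE, CROP_SIDE):.4f}")
    print(f"crop offsets: {num_crop_positions(FULL_SIDE, CROP_SIDE)}")
```

Under these assumptions the crop keeps 576 of 1024 pixels (75% of each side), and there are 81 possible offsets for placing it.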

17 votes

Training data is imbalanced - but should my validation set also be?

The point of the validation set is to select the epoch/iteration where the neural network is most likely to perform the best on the test set. Consequently, it is preferable that the distribution of cl …

Franck Dernoncourt

48.7k

answered Jan 30, 2017 at 16:09

Stack Exchange Network
