Questions tagged [data-mining]

Using the techniques of artificial intelligence and machine learning to extract patterns from large data sets and transforming those data into a useful, organized form for future processing.

0 votes
0 answers
64 views

I have a simple question: how would you measure the logicality of a programming language? EDIT: I was asked to specify the term "logicality", so I will try to provide a stipulation. By ...
Shawn W.'s user avatar
1 vote
1 answer
173 views

I saw a post on Reddit (https://www.reddit.com/r/math/comments/ci50d3/visualizing_mathematical_subjects/) that utilizes label propagation, Fruchterman-Reingold algorithm, and edge betweenness ...
ZENG's user avatar
  • 113
1 vote
1 answer
75 views

I have used different machine learning algorithms to predict solar panels' power output. There are ten independent features for weather data. In all models, I set time as an index and have used the ...
graphicart86's user avatar
1 vote
3 answers
2k views

We were doing project work for plagiarism checking. For this purpose, we have taken a term frequency vector of two documents and measured the similarity using a cosine similarity measure. The value of ...
Tushar Saha's user avatar
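
For readers unfamiliar with the measure mentioned in the excerpt above, cosine similarity over term-frequency vectors can be sketched as follows. This is only a minimal illustration, not the asker's implementation: the function name `cosine_similarity` and the whitespace tokenization are assumptions, and a real plagiarism checker would need proper tokenization and normalization.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity of the term-frequency vectors of two texts.

    Tokenization is a naive lowercase whitespace split (an assumption
    made for this sketch).
    """
    tf_a = Counter(text_a.lower().split())
    tf_b = Counter(text_b.lower().split())

    # Dot product over the terms that occur in both documents.
    dot = sum(tf_a[term] * tf_b[term] for term in tf_a.keys() & tf_b.keys())

    # Euclidean norms of the two term-frequency vectors.
    norm_a = math.sqrt(sum(count * count for count in tf_a.values()))
    norm_b = math.sqrt(sum(count * count for count in tf_b.values()))

    # An empty document has no direction; treat its similarity as 0.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Because term frequencies are never negative, the result lies between 0 (no shared terms) and 1 (identical term distributions).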
1 vote
1 answer
58 views

I am working on building an ML/DL solution for a problem where the data is naturally similar, and I am worried that this would be considered data redundancy. My question is: is that so? And ...
Luka's user avatar
  • 11
0 votes
0 answers
63 views

I'm currently going through past paper questions and was wondering if I could get some help answering this one? 'Consider a classification model which is applied to a set of records, of which 100 ...
curiousCoder's user avatar
0 votes
1 answer
184 views

I am working on a Fraudulent Cash Transaction Detection System using DBSCAN and I want to know what the proper way to identify outliers is. Thank you. Edit: I had a problem with how to represent the ...
Xx_22's user avatar
  • 1
0 votes
0 answers
66 views

I'm a student studying a data mining course and have come across a problem. I need to explain the problem with the help of an example scenario as I do not know how to explain the problem in any other ...
mahesh Rao's user avatar
3 votes
1 answer
236 views

On p. 7 of the book "Introduction to Information Retrieval" (by Manning et al.), the authors explain how, given a collection of text documents, an inverted index is built by tokenizing, then ...
jm jm's user avatar
  • 31
1 vote
0 answers
65 views

I have developed a locality sensitive hashing algorithm for the 3-way or k-way dot product. When I say 3-way dot product I mean the following. Suppose we have $x,y,z \in [-1,1]^{S}$ for $S \in \...
Mihir Mongia's user avatar
2 votes
2 answers
305 views

Hi, in the data mining and machine learning course that I'm taking there is a topic on feature spaces, and there is a part about feature vector aggregation and metric spaces that I don't really ...
Mads's user avatar
  • 21
-1 votes
1 answer
96 views

Let's suppose that we have the following 2 tables: If we want to reduce the dimension by one (in every table), which feature should we remove and why? I am confused about the way that I should work ...
Emily Serone's user avatar
3 votes
1 answer
63 views

There are $N < 3\times10^4$ 3D points. At least 50% of them lie approximately in the same plane, i.e. the distance between the plane and each point is at most $p$. Find such a plane. Attempt: since ...
Ignacio's user avatar
  • 133
2 votes
0 answers
54 views

The typical workflow in topological data analysis is from point cloud data to filtration to a list of bar codes corresponding to each dimension. A filtration is a sequence of simplicial complexes, ...
Eben Kadile's user avatar
0 votes
0 answers
51 views

I am working on a Systematic Literature Review (SLR) and am about to be done with the data synthesis. After the SLR, I want to create an ontology and include different details of the SLR in the ontology. I have almost ...
Khan's user avatar
  • 1
1 vote
0 answers
174 views

I currently hold a bachelor's in Computer Science and a master's in Art History. I really want to combine the two and I know of Digital Humanities but I'm not completely aware of where Digital Humanists ...
dcs1's user avatar
  • 11
0 votes
0 answers
127 views

I am looking for a reference explaining how to solve Navier-Stokes numerically using machine learning algorithms. Thank you in advance for your help.
ABRAICH Ayoub's user avatar
4 votes
1 answer
339 views

I'm looking for a way to recognize and possibly extract source code from text files that may contain only source code, source code mixed with plain text or just plain text without any source code. ...
Marv's user avatar
  • 143
1 vote
0 answers
627 views

It is common in data science to receive two equal-length vectors (arrays of dimension 1), say Categories and Weights. We aim to find all unique values of Categories and sum up the corresponding ...
xiaodai's user avatar
  • 131
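
The grouped sum described in the excerpt above can be sketched in a few lines. This is a minimal illustration under assumptions: the name `sum_by_category` is hypothetical, and plain Python lists stand in for whatever array type the asker actually uses.

```python
from collections import defaultdict


def sum_by_category(categories, weights):
    """Sum the weights that belong to each unique category.

    `categories` and `weights` are equal-length sequences; element i of
    `weights` belongs to element i of `categories`.
    """
    if len(categories) != len(weights):
        raise ValueError("categories and weights must have the same length")

    totals = defaultdict(float)
    # Single pass: accumulate each weight under its category.
    for category, weight in zip(categories, weights):
        totals[category] += weight
    return dict(totals)
```

A single pass with a hash map takes time linear in the length of the vectors, which is usually the baseline that faster vectorized approaches are compared against.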
1 vote
0 answers
44 views

I have an extremely large (100GB+) corpus of many different texts. All of them are in English and 'well' formatted. They are not loaded into any kind of database, think of them as a huge collection of ...
Alex Morales's user avatar
1 vote
0 answers
332 views

I have gone through many algorithms, including streaming k-means, CluStream, etc., and they all have their pros and cons. What is the best-performing algorithm in terms of Computational Complexity, Memory ...
Cybernix's user avatar
0 votes
1 answer
959 views

A receipt is an array of products. I have an array of receipts. I need to generate a report in which I can find the products that are often bought together. For instance, for a single receipt where the ...
Berry's user avatar
  • 101
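
One simple way to approach the report described in the excerpt above is to count how often each pair of products appears on the same receipt. The sketch below is an illustration only: the name `count_product_pairs` is hypothetical, products are assumed to be hashable and sortable values such as strings, and only pairs (not larger item sets) are counted.

```python
from collections import Counter
from itertools import combinations


def count_product_pairs(receipts):
    """Count how many receipts contain each unordered pair of products."""
    pair_counts = Counter()
    for receipt in receipts:
        # Deduplicate and sort so that (a, b) and (b, a) share one key
        # and a product listed twice on a receipt is counted once.
        unique_products = sorted(set(receipt))
        for pair in combinations(unique_products, 2):
            pair_counts[pair] += 1
    return pair_counts
```

Calling `count_product_pairs(receipts).most_common(10)` would then list the ten most frequent pairs. For large catalogs or item sets bigger than pairs, frequent-itemset algorithms such as Apriori are the usual next step.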
2 votes
2 answers
256 views

I have data like this. [0 1 0 1 0] [0 1 0 1 0 1 1] [0 1 0 1] [0 1 0 1 0 1 1 1 1 0] ... I want to classify it with a neural network, but my data have different sizes. I can ...
user572575's user avatar
12 votes
5 answers
20k views

The general question, as the title suggests, is: What is the difference between DS and OR/optimization? On a conceptual level I understand that DS tries to extract knowledge from the available data ...
PsySp's user avatar
  • 261
1 vote
1 answer
244 views

Let's say that I have $N$ data sets where I have data points at some fixed frequency, such as "daily". What would be a good method for finding correlation between any of the data sets, or choosing a ...
Alan Wolfe's user avatar
  • 1,368