
I have a problem with matplotlib.pyplot.scatter.

First, I need to download the Iris classification data and add column headers.

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns
    
    plt.style.use('seaborn')
    
    %matplotlib inline
    
    df = pd.read_csv('http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data', header = None)
    df_names = ['sepal length in cm', 'sepal width in cm', 'petal length in cm', 'petal width in cm', 'class']
    df.columns = df_names
    df

Secondly, I should create a scatter plot of the data using matplotlib.pyplot.scatter in the following manner:

    * for x and y coordinates use sepal length and width respectively
    * for size use the petal length
    * for alpha (opacity/transparency) use the petal width
    * illustrate iris belonging to each class by using 3 distinct colours (RGB for instance, but be creative if you want)
    * *some columns will need to be scaled, to be passed as parameters; you might also want to scale some other columns to increase the readability of the illustration.

Then, I found this site: https://www.geeksforgeeks.org/matplotlib-pyplot-scatter-in-python/

After that, I adapted their example for my task:

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns

    plt.style.use('seaborn')

    %matplotlib inline

    # dataset-df

    x1 = [4.3, 7.9, 5.84, 0.83, 0.7826]

    y1 = [2.0, 4.4, 3.05, 0.43, -0.4194]

    plt.scatter(x1, y1, c ="red",
                alpha = 1.0, 6.9, 3.76, 1.76, 0.9490,
                linewidth = 2,
                marker ="s",
                s = [1.0, 6.9, 3.76, 1.76, 0.9490])

    plt.xlabel("X-axis")
    plt.ylabel("Y-axis")
    plt.show()

However, I always get this error:

File "C:\Users\felix\AppData\Local\Temp/ipykernel_32284/4113309647.py", line 21
    s = [1.0, 6.9, 3.76, 1.76, 0.9490])
                                      ^
SyntaxError: positional argument follows keyword argument

Could you advise me on how to fix this problem and complete my task?

In addition, I copied the data from iris.names:

1. Title: Iris Plants Database
    Updated Sept 21 by C.Blake - Added discrepency information

2. Sources:
     (a) Creator: R.A. Fisher
     (b) Donor: Michael Marshall (MARSHALL%[email protected])
     (c) Date: July, 1988

3. Past Usage:
   - Publications: too many to mention!!!  Here are a few.
   1. Fisher,R.A. "The use of multiple measurements in taxonomic problems"
      Annual Eugenics, 7, Part II, 179-188 (1936); also in "Contributions
      to Mathematical Statistics" (John Wiley, NY, 1950).
   2. Duda,R.O., & Hart,P.E. (1973) Pattern Classification and Scene Analysis.
      (Q327.D83) John Wiley & Sons.  ISBN 0-471-22361-1.  See page 218.
   3. Dasarathy, B.V. (1980) "Nosing Around the Neighborhood: A New System
      Structure and Classification Rule for Recognition in Partially Exposed
      Environments".  IEEE Transactions on Pattern Analysis and Machine
      Intelligence, Vol. PAMI-2, No. 1, 67-71.
      -- Results:
         -- very low misclassification rates (0% for the setosa class)
   4. Gates, G.W. (1972) "The Reduced Nearest Neighbor Rule".  IEEE 
      Transactions on Information Theory, May 1972, 431-433.
      -- Results:
         -- very low misclassification rates again
   5. See also: 1988 MLC Proceedings, 54-64.  Cheeseman et al's AUTOCLASS II
      conceptual clustering system finds 3 classes in the data.

4. Relevant Information:
   --- This is perhaps the best known database to be found in the pattern
       recognition literature.  Fisher's paper is a classic in the field
       and is referenced frequently to this day.  (See Duda & Hart, for
       example.)  The data set contains 3 classes of 50 instances each,
       where each class refers to a type of iris plant.  One class is
       linearly separable from the other 2; the latter are NOT linearly
       separable from each other.
   --- Predicted attribute: class of iris plant.
   --- This is an exceedingly simple domain.
   --- This data differs from the data presented in Fishers article
    (identified by Steve Chadwick,  [email protected] )
    The 35th sample should be: 4.9,3.1,1.5,0.2,"Iris-setosa"
    where the error is in the fourth feature.
    The 38th sample: 4.9,3.6,1.4,0.1,"Iris-setosa"
    where the errors are in the second and third features.  

5. Number of Instances: 150 (50 in each of three classes)

6. Number of Attributes: 4 numeric, predictive attributes and the class

7. Attribute Information:
   1. sepal length in cm
   2. sepal width in cm
   3. petal length in cm
   4. petal width in cm
   5. class: 
      -- Iris Setosa
      -- Iris Versicolour
      -- Iris Virginica

8. Missing Attribute Values: None

Summary Statistics:
             Min  Max   Mean    SD   Class Correlation
   sepal length: 4.3  7.9   5.84  0.83    0.7826   
    sepal width: 2.0  4.4   3.05  0.43   -0.4194
   petal length: 1.0  6.9   3.76  1.76    0.9490  (high!)
    petal width: 0.1  2.5   1.20  0.76    0.9565  (high!)

9. Class Distribution: 33.3% for each of 3 classes.
  • Why is alpha assigned as alpha = 1.0, 6.9, 3.76, 1.76, 0.9490? It should get just one number. Commented Mar 18, 2022 at 22:30

1 Answer

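There is no problem with the Iris dataset; the error comes from how you passed the alpha argument to the scatter function. In alpha = 1.0, 6.9, 3.76, 1.76, 0.9490, only 1.0 is bound to alpha, and Python reads the remaining numbers as positional arguments that follow a keyword argument, which is a syntax error. Pass a single value instead: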

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns

    plt.style.use('seaborn')

    %matplotlib inline

    # dataset-df

    x1 = [4.3, 7.9, 5.84, 0.83, 0.7826]

    y1 = [2.0, 4.4, 3.05, 0.43, -0.4194]

    plt.scatter(x1, y1, c ="red",alpha = 1,
                linewidth = 2,
                marker ="s",
                s = [1.0, 6.9, 3.76, 1.76, 0.9490])

    plt.xlabel("X-axis")
    plt.ylabel("Y-axis")
    plt.show()
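
The SyntaxError is raised by the Python parser before matplotlib runs, so it can be reproduced without plotting anything. The sketch below is only an illustration: scatter_stub is a hypothetical name that is never defined, because compile() parses the call text without executing it.

```python
# scatter_stub is a hypothetical stand-in for plt.scatter; compile() only
# parses the source text, so the function does not need to exist.
bad_call = "scatter_stub(x1, y1, alpha = 1.0, 6.9, 3.76, s = [1.0, 6.9])"
good_call = "scatter_stub(x1, y1, alpha = 1.0, s = [1.0, 6.9])"


def syntax_error_message(source):
    """Return the SyntaxError message for source, or None if it parses."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError as exc:
        return exc.msg
    return None


print(syntax_error_message(bad_call))   # the parser's complaint
print(syntax_error_message(good_call))  # None: this call parses cleanly
```

Once every value after the first keyword argument is itself passed by keyword, the call parses and the error goes away.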

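Note that alpha must lie between 0 (transparent) and 1 (opaque), so a single number such as 0.9, 0.8 or even 0.823425 is fine, while values such as 6.9 or 3.76 are out of range. As far as I know, Matplotlib 3.4 and later also accept an array of per-point alpha values with the same length as x, but older versions accept only one number.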
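
For the task itself (size from petal length, transparency from petal width, one colour per class), one possible sketch follows. It assumes the column names from the question and uses six rows copied from iris.data in place of the full download. The helpers min_max and scatter_params, the colour choices, and the size and opacity ranges are illustrative, not part of matplotlib. Per-point opacity is folded into RGBA colours, so the sketch does not depend on whether the installed Matplotlib accepts an array for alpha.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.colors import to_rgb

COLUMNS = ["sepal length in cm", "sepal width in cm",
           "petal length in cm", "petal width in cm", "class"]

# Six sample rows from iris.data (two per class), standing in for the
# DataFrame that pd.read_csv returns in the question.
df = pd.DataFrame(
    [
        [5.1, 3.5, 1.4, 0.2, "Iris-setosa"],
        [4.9, 3.0, 1.4, 0.2, "Iris-setosa"],
        [7.0, 3.2, 4.7, 1.4, "Iris-versicolor"],
        [6.4, 3.2, 4.5, 1.5, "Iris-versicolor"],
        [6.3, 3.3, 6.0, 2.5, "Iris-virginica"],
        [5.8, 2.7, 5.1, 1.9, "Iris-virginica"],
    ],
    columns=COLUMNS,
)

# Illustrative colour choice: one named colour per class.
CLASS_COLOURS = {
    "Iris-setosa": "red",
    "Iris-versicolor": "green",
    "Iris-virginica": "blue",
}


def min_max(values, low, high):
    """Linearly rescale values into the interval [low, high]."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        # All values equal: place every point in the middle of the range.
        return np.full_like(values, (low + high) / 2)
    return low + (values - values.min()) / span * (high - low)


def scatter_params(frame):
    """Return marker sizes and an (n, 4) RGBA colour array for the frame."""
    sizes = min_max(frame["petal length in cm"], 20, 300)   # points squared
    alphas = min_max(frame["petal width in cm"], 0.2, 1.0)  # stay in (0, 1]
    rgb = np.array([to_rgb(CLASS_COLOURS[name]) for name in frame["class"]])
    return sizes, np.column_stack([rgb, alphas])


sizes, rgba = scatter_params(df)

fig, ax = plt.subplots()
ax.scatter(df["sepal length in cm"], df["sepal width in cm"], s=sizes, c=rgba)
ax.set_xlabel("sepal length in cm")
ax.set_ylabel("sepal width in cm")
```

In a notebook, remove the matplotlib.use("Agg") line and finish with plt.show(). The lower bound of 0.2 for opacity keeps the flowers with the narrowest petals visible instead of fully transparent.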

