1

I am attempting to read .csv data into Python.

This is what my code looks like:

import csv

revenue = {}

prices = []
for i in range(1,21):
    prices.append(i)
prices = tuple(prices) #convert to tuple to make immutable and faster   

with open("test.csv") as file_handle:
    file_reader = csv.reader(file_handle)
    file_handle.readline()
    file_handle.readline()  #skip first 2 lines due to column header titles
    for row in file_reader:
        revenue[prices] = row[1] #assign revenue at each price-point

print revenue

print revenue[10]

This is what the .csv data (my input) looks like:

0.01    1397371
0.02    1350610
0.03    1306431
0.04    1136959
0.05    1106950
0.06    1064727
0.07    1037497
0.08    1030768
0.09    976552
0.1     963091
0.11    949641
0.12    917551
0.13    884734
0.14    878675
0.15    775261
0.16    765643
0.17    756057
0.18    733458
0.19    723077
0.2     654178

The first column is prices, and the second column is revenue. Because my selection of prices is always the same, I ignored that column and simply created a prices list in integer form, which I converted to a tuple (since I read that tuples are processed more quickly when the data does not need to change).

PROBLEM: if I print revenue[10] I want to see 963091. Instead I get KeyError: 20.

And when I print revenue, I expected all prices and associated revenues to be printed, instead, I get the entire price list printed, followed by the final revenue value for the last price in the list (0.2), 654178.

I'm new to Python, so I apologize for the rookie questions. I've been reading and trying to figure this out, and I'm still struggling. Any advice on my approach is also welcome; I can use all the help I can get.

Thanks in advance!

ploo
  • `revenue` is a dictionary and it doesn't contain a key whose value is 10. Based on what you are looking for, you need to define revenue as a list: `revenue = []` and later use `revenue.append(row[1])` – linuxfan Sep 30 '14 at 18:34
  • how can the `Keyerror` be `20` when you are using `10`? Add the actual traceback from the error – Padraic Cunningham Sep 30 '14 at 18:35
  • You are using the whole `prices` tuple as a key. That's just *one key* in your dictionary. – Martijn Pieters Sep 30 '14 at 18:35
  • Also possibly you should look at related question [reading two column csv as dict](http://stackoverflow.com/questions/17870022/read-two-column-csv-as-dict-with-1st-column-as-key). – nes Sep 30 '14 at 18:39
  • Apologies for that, Keyerror is 10, that was a typo. – ploo Sep 30 '14 at 18:40

3 Answers

1

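revenue[prices] = row[1] does not store the value under an individual price. It uses the entire prices tuple as the key, so every iteration overwrites the same single entry. (Also, with the whitespace-separated data shown, each row is a single-item list.)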

>>> revenue[prices] = ''
>>> revenue
{(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20): ''}

with open("test.csv") as file_handle:
    file_reader = csv.reader(file_handle)
    file_handle.readline()
    file_handle.readline()  # skip first 2 lines due to column header titles
    for row in file_reader:
        price, amount = row[0].split()  # the single item holds both columns
        # round() before int() guards against float error (0.29 * 100 is slightly below 29)
        revenue[int(round(float(price) * 100))] = amount  # assign revenue at each price-point

To convert the row into the price and the revenue, take the first (and only) item, split it on whitespace into the price and the revenue, and convert the decimal price to an integer.
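
A self-contained sketch of the same idea, assuming the file really is whitespace-separated as displayed in the question. The sample text, its two header lines, and the helper name `load_revenue` are illustrative and not from the original post. It reads from an in-memory string so it runs without test.csv, and it is written for Python 3:

```python
import io

# Illustrative stand-in for test.csv: two header lines, then "price revenue" rows.
SAMPLE = """Price    Revenue
(USD)    (USD)
0.09    976552
0.1     963091
0.11    949641
"""

def load_revenue(handle, header_lines=2):
    """Return {price_in_cents: revenue} from whitespace-separated rows."""
    revenue = {}
    for line_number, line in enumerate(handle):
        if line_number < header_lines or not line.strip():
            continue  # skip the header lines and any blank lines
        price, amount = line.split()
        # round() avoids truncation when float(price) * 100 lands just below an integer
        revenue[int(round(float(price) * 100))] = int(amount)
    return revenue

revenue = load_revenue(io.StringIO(SAMPLE))
print(revenue[10])
```

Storing the revenue as an `int` rather than a string is a choice made here for convenience; the answer above keeps it as a string.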

Celeo
  • Thanks for your reply Celeo, do you have a suggestion as to how to go about associating the value with each key, in this case the price? – ploo Sep 30 '14 at 18:46
0

This is because revenue is declared as a dictionary. When you access elements in a dictionary, you must use a key that exists in it rather than a positional index. Hence, in this case, revenue[prices[10]]. Note that tuple indexes start at 0, so prices[10] is actually the 11th element.

Also, revenue[prices] = row[1] uses the entire prices tuple as the key, so the value from every row is stored under that one key.
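
A small sketch of the difference, using a three-element tuple and the first three revenue values from the question rather than the real file. Assigning through the whole tuple keeps overwriting one entry, while indexing the tuple first gives one entry per price:

```python
prices = tuple(range(1, 4))  # (1, 2, 3), shortened for illustration
rows = ["1397371", "1350610", "1306431"]

# Using the whole tuple as the key: every assignment hits the same entry.
by_tuple = {}
for value in rows:
    by_tuple[prices] = value

# Using one element of the tuple as the key: one entry per price.
by_price = {}
for index, value in enumerate(rows):
    by_price[prices[index]] = value

print(len(by_tuple), len(by_price))
print(prices[2])  # index 2 is the third element
```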

acquayefrank
  • I thought that because it is looping through the input, it would associate each revenue amount with the corresponding price in order (since there is the same number of keys as there are values) – ploo Sep 30 '14 at 19:16
  • Since you are using revenue[price] = row[1] within a for loop, you will be assigning whatever element is at row[1] to the dictionary element with key price. Also, correct me if I'm wrong, but shouldn't values in csv files be separated by commas? – acquayefrank Sep 30 '14 at 19:30
0

A KeyError is raised when you look up a key that doesn't exist in the dictionary. In your case, when building the revenue dict in the for loop, you assign every revenue to the same key, which is the whole prices tuple:

revenue[prices] = row[1]

Then you look up a key that was never added. To fix this, assign each revenue to the corresponding element of the prices tuple inside the for loop.

i = 0
for row in file_reader:
    revenue[prices[i]] = row[1] #assign revenue at each price-point
    i += 1

But make sure that the tuple has at least as many elements as there are records in the file; otherwise prices[i] raises an IndexError.
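
One way to avoid both the manual counter and the length concern is `zip`, which pairs the two sequences and stops at the shorter one. A minimal sketch, with a short hard-coded list standing in for the row[1] values read from the file:

```python
prices = tuple(range(1, 21))
# Stand-in for the second column of the first three data rows.
revenues = ["1397371", "1350610", "1306431"]

# zip() stops when the shorter input is exhausted, so no IndexError is possible.
revenue = dict(zip(prices, revenues))
print(revenue)
```

Because `zip` truncates silently, compare the two lengths first if a mismatch between prices and records should be treated as an error.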

Also, there is a shorter way to construct the prices tuple:

prices = tuple(range(1,21))
g.borisov