"TypeError: string indices must be integers" when getting stock data from Yahoo Finance with Pandas Datareader
import pandas_datareader
end = "2022-12-15"
start = "2022-12-15"
stock_list = ["TATAELXSI.NS"]
data = pandas_datareader.get_data_yahoo(symbols=stock_list, start=start, end=end)
print(data)
When I run this code, I get the error "TypeError: string indices must be integers".
Edit: I have updated the code to pass a list as the symbols parameter, but it still shows the same error.
Error:
Traceback (most recent call last):
File "C:\Users\Deepak Shetter\PycharmProjects\100DAYSOFPYTHON\mp3downloader.py", line 7, in <module>
data = pandas_datareader.get_data_yahoo(symbols=[TATAELXSI], start=start, end=end)
File "C:\Users\Deepak Shetter\PycharmProjects\100DAYSOFPYTHON\venv\lib\site-packages\pandas_datareader\data.py", line 80, in get_data_yahoo
return YahooDailyReader(*args, **kwargs).read()
File "C:\Users\Deepak Shetter\PycharmProjects\100DAYSOFPYTHON\venv\lib\site-packages\pandas_datareader\base.py", line 258, in read
df = self._dl_mult_symbols(self.symbols)
File "C:\Users\Deepak Shetter\PycharmProjects\100DAYSOFPYTHON\venv\lib\site-packages\pandas_datareader\base.py", line 268, in _dl_mult_symbols
stocks[sym] = self._read_one_data(self.url, self._get_params(sym))
File "C:\Users\Deepak Shetter\PycharmProjects\100DAYSOFPYTHON\venv\lib\site-packages\pandas_datareader\yahoo\daily.py", line 153, in _read_one_data
data = j["context"]["dispatcher"]["stores"]["HistoricalPriceStore"]
TypeError: string indices must be integers
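None of the solutions reported here so far worked for me. According to the linked GitHub discussion, Yahoo changed their API, which broke compatibility with earlier pandas datareader versions.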
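To see why an API change surfaces as this particular TypeError, here is a minimal sketch using only the standard library. Both payloads are made up for illustration; the real Yahoo response is not shown in this thread. The assumption is that a value along the j["context"]["dispatcher"]["stores"] path now arrives as a string rather than a nested object, so the next string-key lookup fails.

```python
import json

# Hypothetical payloads for illustration only; not real Yahoo responses.
old_style = json.dumps(
    {"context": {"dispatcher": {"stores": {"HistoricalPriceStore": {"prices": []}}}}}
)
# Assumption: "stores" is now an opaque string instead of a nested object.
new_style = json.dumps(
    {"context": {"dispatcher": {"stores": "opaque-string-placeholder"}}}
)


def extract_store(payload):
    """Mimic the lookup shown at daily.py line 153 in the traceback."""
    j = json.loads(payload)
    return j["context"]["dispatcher"]["stores"]["HistoricalPriceStore"]


def try_extract(payload):
    """Return the store, or None if the payload no longer has the expected shape."""
    try:
        return extract_store(payload)
    except (TypeError, KeyError) as exc:
        print(f"Unexpected response shape: {exc!r}")
        return None


print(try_extract(old_style))  # the nested dict is found
print(try_extract(new_style))  # a str indexed with a str key raises TypeError
```

The point is only that the error comes from the shape of the response, not from how the symbols argument is passed, which is why switching between a string and a list does not help.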
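In the same GitHub thread, a fix was reported, implemented in a pull request by GitHub user raphi6. I can confirm that the pull request works. The version from the pull request can be installed with these 3 lines: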
conda install pycryptodome pycryptodomex
conda uninstall pandas-datareader
pip install git+https://github.com/raphi6/pandas-datareader.git@ea66d6b981554f9d0262038aef2106dda7138316
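The pycrypto* packages are dependencies I had to install to make it work. Note that I am using the commit hash here instead of the branch name, because the branch is named Yahoo!_Issue#952 and the hash character (#) causes problems when used with pip this way.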
All of this can also be done with pip in place of the conda commands (see Update 1 below).
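The problem with the hash character can be illustrated with generic URL parsing. This is only a sketch with Python's urllib.parse; how pip itself processes the URL is not shown in this thread, so treat the parallel as an assumption. In a URL, everything after # is the fragment, so the branch name gets cut off at "Yahoo!_Issue", while a commit hash contains no special characters.

```python
from urllib.parse import quote, urlsplit

BASE = "git+https://github.com/raphi6/pandas-datareader.git"
BRANCH = "Yahoo!_Issue#952"
COMMIT = "ea66d6b981554f9d0262038aef2106dda7138316"

branch_parts = urlsplit(f"{BASE}@{BRANCH}")
commit_parts = urlsplit(f"{BASE}@{COMMIT}")

# With the branch name, "#952" is split off as the URL fragment.
print("branch path:    ", branch_parts.path)
print("branch fragment:", branch_parts.fragment)

# With the commit hash, the whole revision stays in the path.
print("commit path:    ", commit_parts.path)
print("commit fragment:", repr(commit_parts.fragment))

# Percent-encoding shows what the special characters would look like escaped.
# Whether pip accepts an escaped branch name is not verified here.
print("encoded branch: ", quote(BRANCH, safe=""))
```

Pinning to the commit hash sidesteps the question entirely and also makes the install reproducible if the branch moves later.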
Update 1
To try this on Google Colab, use the following (as shown here):
! pip install pycryptodome pycryptodomex
! pip uninstall --yes pandas-datareader
! pip install git+https://github.com/raphi6/pandas-datareader.git@ea66d6b981554f9d0262038aef2106dda7138316
Update 2 (27/12/2022)
Although I could not get it to work last week, I tried again and can confirm that the pdr_override() workaround mentioned by Nikhil Mulley below now works (at least with yfinance 0.2.3 and pandas-datareader 0.10.0).
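Since the workaround is reported to work with specific versions, it can help to check what is installed before debugging further. The following is a small sketch using the standard library's importlib.metadata; the helper name installed_versions is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("yfinance", "pandas-datareader")):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```

Compare the printed versions with the ones mentioned above (yfinance 0.2.3 and pandas-datareader 0.10.0); older yfinance releases may still fail.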
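Original answer (works, but takes more lines of code)
In the same GitHub thread, a fix was reported, implemented in a pull request by GitHub user raphi6. I can confirm that the pull request works. Detailed installation instructions for the pull request can be found here; they are copied below for completeness.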
git clone https://github.com/raphi6/pandas-datareader.git
cd pandas-datareader
conda uninstall pandas-datareader
conda install pycryptodome pycryptodomex
git checkout 'Yahoo!_Issue#952'
python setup.py install --record installed_files.txt
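The --record argument in the install command produces a list of the installed files, which makes it easy to uninstall later (following this SO thread). The pycrypto* packages are dependencies I had to install to make it work.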
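As a rough sketch of how the recorded file list could be used for a later uninstall, the helper below reads a record file and deletes the files it lists. The function name remove_recorded_files and the dry_run flag are made up for this example, and the demo runs against a temporary directory rather than a real installation. It assumes the record file holds one path per line.

```python
import tempfile
from pathlib import Path


def remove_recorded_files(record_file, dry_run=True):
    """Delete the files listed in a --record file; return the paths handled.

    With dry_run=True nothing is deleted, so the list can be reviewed first.
    """
    handled = []
    for line in Path(record_file).read_text().splitlines():
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        path = Path(entry)
        if not path.is_file():
            continue  # already gone, or not a regular file
        if not dry_run:
            path.unlink()
        handled.append(path)
    return handled


# Demo against a throwaway directory, not a real site-packages.
with tempfile.TemporaryDirectory() as tmp:
    fake_module = Path(tmp) / "fake_module.py"
    fake_module.write_text("# placeholder\n")
    record = Path(tmp) / "installed_files.txt"
    record.write_text(f"{fake_module}\n")

    print("would remove:", remove_recorded_files(record, dry_run=True))
    print("removed:     ", remove_recorded_files(record, dry_run=False))
```

Running with dry_run=True first is a cheap safeguard, since the record file may list paths you do not expect.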
! git clone https://github.com/raphi6/pandas-datareader.git && cd pandas-datareader && git checkout 'Yahoo!_Issue#952' && pip uninstall --yes pandas-datareader && yes | pip install pycryptodome pycryptodomex && python setup.py install --record installed_files.txt
I used the command above to install it. The command seems to run fine in Google Colab, but import pandas_datareader then fails with a module-not-found error. Any clues?
- joanlofe 2022-12-20
... setup.py, solves the problem more elegantly and is also friendlier to Google Colab.
- joanlofe 2022-12-20
This is not an answer, but I think the problem lies in how the pdr data reader fetches data from Yahoo itself:
>>> import pandas_datareader as dtr
>>> from datetime import datetime
>>> initial_portfolio=['AAPL', 'MA', 'F', 'MSFT', '^GSPC']
>>> startdate = datetime(2022,12,1)
>>> enddate=datetime(2022,12,10)
>>> stock_data=dtr.yahoo.daily.YahooDailyReader(initial_portfolio,start=startdate,end=enddate).read()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "lib/python3.9/site-packages/pandas_datareader/base.py", line 258, in read
df = self._dl_mult_symbols(self.symbols)
File "lib/python3.9/site-packages/pandas_datareader/base.py", line 268, in _dl_mult_symbols
stocks[sym] = self._read_one_data(self.url, self._get_params(sym))
File "lib/python3.9/site-packages/pandas_datareader/yahoo/daily.py", line 153, in _read_one_data
data = j["context"]["dispatcher"]["stores"]["HistoricalPriceStore"]
TypeError: string indices must be integers
A short-term solution may be to use the yfinance override in the meantime and see whether that helps, until Yahoo Finance restores its data functions?
Python 3.9.1 (default, Dec 28 2020, 11:22:14)
[Clang 11.0.0 (clang-1100.0.33.17)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from pandas_datareader import data as pdr
>>> import yfinance as yf
>>> yf.pdr_override()
>>> y_symbols = ['SCHAND.NS', 'TATAPOWER.NS', 'ITC.NS']
>>> from datetime import datetime
>>> startdate = datetime(2022,12,1)
>>> enddate = datetime(2022,12,15)
>>> data = pdr.get_data_yahoo(y_symbols, start=startdate, end=enddate)
[*********************100%***********************] 3 of 3 completed
>>> data
Adj Close Close ... Open Volume
ITC.NS SCHAND.NS TATAPOWER.NS ITC.NS SCHAND.NS TATAPOWER.NS ... ITC.NS SCHAND.NS TATAPOWER.NS ITC.NS SCHAND.NS TATAPOWER.NS
Date ...
2022-12-01 339.549988 195.949997 224.850006 339.549988 195.949997 224.850006 ... 341.700012 191.600006 225.250000 16630417 544485 7833074
2022-12-02 337.149994 196.600006 225.250000 337.149994 196.600006 225.250000 ... 339.350006 196.000000 225.449997 8388835 122126 7223274
2022-12-05 336.750000 191.050003 224.199997 336.750000 191.050003 224.199997 ... 337.649994 200.850006 225.250000 9716390 107294 10750610
2022-12-06 337.299988 196.399994 228.800003 337.299988 196.399994 228.800003 ... 334.100006 191.000000 224.199997 6327430 102911 20071039
2022-12-07 340.100006 187.350006 225.850006 340.100006 187.350006 225.850006 ... 338.500000 198.000000 228.800003 9813208 122772 7548312
2022-12-08 338.399994 181.850006 225.050003 338.399994 181.850006 225.050003 ... 340.200012 186.000000 226.000000 6200447 114147 7507975
2022-12-09 341.399994 176.899994 219.399994 341.399994 176.899994 219.399994 ... 339.750000 183.899994 225.899994 8132228 179660 13087278
2022-12-12 343.200012 177.350006 217.699997 343.200012 177.350006 217.699997 ... 341.000000 177.750000 219.750000 11214662 133507 8858525
2022-12-13 345.600006 178.449997 218.850006 345.600006 178.449997 218.850006 ... 344.500000 179.350006 218.800003 10693426 74873 7265105
2022-12-14 345.399994 179.149994 222.699997 345.399994 179.149994 222.699997 ... 346.000000 180.449997 219.800003 7379878 32085 9179593
[10 rows x 18 columns]
>>>
Use yfinance instead; it worked for me.
import datetime as dt
import yfinance as yf
company = 'TATAELXSI.NS'
# Define a start date and End Date
start = dt.datetime(2020,1,1)
end = dt.datetime(2022,1,1)
# Read Stock Price Data
data = yf.download(company, start=start, end=end)
data.tail(10)
Update yfinance; that fixed it for me.