当前位置: 首页 > news >正文

福清做网站的公司百度收录网站

福清做网站的公司,百度收录网站,网站如何做会员登录页面,免费进入正能量的网站使用DTW算法简单实现曲线的相似度计算-CSDN博客 前文中股票高相关k线筛选问题的延伸。基于github上的代码迁移应用到股票高相关预测上。 这里给出一个相关完整的代码实现案例。 1、数据准备 假设你已经有了一些历史股票的k线数据。如果数据能打标哪些股票趋势是上涨的、下跌…

使用DTW算法简单实现曲线的相似度计算-CSDN博客

前文中股票高相关k线筛选问题的延伸。基于github上的代码迁移应用到股票高相关预测上。

这里给出一个相关完整的代码实现案例。

1、数据准备

假设你已经有了一些历史股票的k线数据。如果数据能打标哪些股票趋势是上涨的、下跌的会更好。假设这是你目前正在研究的股票k线图:

其他支股票的k-线图如下:

plt.figure(figsize=(11,7))
colors = ['#D62728','#2C9F2C','#FD7F23','#1F77B4','#9467BD','#8C564A','#7F7F7F','#1FBECF','#E377C2','#BCBD27']for i, r in enumerate([0,27,65,100,145,172]):plt.subplot(3,2,i+1)plt.plot(x_train[r][:100], label=labels[y_train[r]], color=colors[i], linewidth=2)plt.xlabel('time sequece')plt.legend(loc='upper left')plt.tight_layout()

接下来要使用基于dtw距离计算的knn近邻算法来找出与目标股票ta高相关的Top10支股票,并将他们的k-线图与ta股票的k-线图进行可视化对比呈现。

2、训练KnnDtw算法模型

import sys
import collections
import itertools
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import mode
from scipy.spatial.distance import squareformplt.style.use('bmh')
%matplotlib inlinetry:from IPython.display import clear_outputhave_ipython = True
except ImportError:have_ipython = Falseclass KnnDtw(object):"""K-nearest neighbor classifier using dynamic time warpingas the distance measure between pairs of time series arraysArguments---------n_neighbors : int, optional (default = 5)Number of neighbors to use by default for KNNmax_warping_window : int, optional (default = infinity)Maximum warping window allowed by the DTW dynamicprogramming functionsubsample_step : int, optional (default = 1)Step size for the timeseries array. By setting subsample_step = 2,the timeseries length will be reduced by 50% because every seconditem is skipped. Implemented by x[:, ::subsample_step]"""def __init__(self, n_neighbors=5, max_warping_window=10000, subsample_step=1):self.n_neighbors = n_neighborsself.max_warping_window = max_warping_windowself.subsample_step = subsample_stepdef fit(self, x, l):"""Fit the model using x as training data and l as class labelsArguments---------x : array of shape [n_samples, n_timepoints]Training data set for input into KNN classiferl : array of shape [n_samples]Training labels for input into KNN classifier"""self.x = xself.l = ldef _dtw_distance(self, ts_a, ts_b, d = lambda x,y: abs(x-y)):"""Returns the DTW similarity distance between two 2-Dtimeseries numpy arrays.Arguments---------ts_a, ts_b : array of shape [n_samples, n_timepoints]Two arrays containing n_samples of timeseries datawhose DTW distance between each sample of A and Bwill be comparedd : DistanceMetric object (default = abs(x-y))the distance measure used for A_i - B_j in theDTW dynamic programming functionReturns-------DTW distance between A and B"""# Create cost matrix via broadcasting with large intts_a, ts_b = np.array(ts_a), np.array(ts_b)M, N = len(ts_a), len(ts_b)cost = sys.maxsize * np.ones((M, N))# Initialize the first row and columncost[0, 0] = d(ts_a[0], ts_b[0])for i in np.arange(1, M):cost[i, 0] = cost[i-1, 0] + d(ts_a[i], ts_b[0])for j in np.arange(1, N):cost[0, j] = cost[0, j-1] + d(ts_a[0], ts_b[j])# Populate rest of cost matrix within windowfor i in np.arange(1, M):for j in np.arange(max(1, i - self.max_warping_window),min(N, i + self.max_warping_window)):choices = cost[i - 1, j - 1], cost[i, j-1], cost[i-1, j]cost[i, j] = min(choices) + d(ts_a[i], ts_b[j])# Return DTW distance given window return cost[-1, -1]def _dist_matrix(self, x, y):"""Computes the M x N distance matrix between the trainingdataset and testing dataset (y) using the DTW distance measureArguments---------x : array of shape [n_samples, n_timepoints]y : array of shape [n_samples, n_timepoints]Returns-------Distance matrix between each item of x and y withshape [training_n_samples, testing_n_samples]"""# Compute the distance matrix        dm_count = 0# Compute condensed distance matrix (upper triangle) of pairwise dtw distances# when x and y are the same arrayif(np.array_equal(x, y)):x_s = np.shape(x)dm = np.zeros((x_s[0] * (x_s[0] - 1)) // 2, dtype=np.double)p = ProgressBar(shape(dm)[0])for i in np.arange(0, x_s[0] - 1):for j in np.arange(i + 1, x_s[0]):dm[dm_count] = self._dtw_distance(x[i, ::self.subsample_step],y[j, ::self.subsample_step])dm_count += 1p.animate(dm_count)# Convert to squareformdm = squareform(dm)return dm# Compute full distance matrix of dtw distnces between x and yelse:x_s = np.shape(x)y_s = np.shape(y)dm = np.zeros((x_s[0], y_s[0])) dm_size = x_s[0]*y_s[0]p = ProgressBar(dm_size)for i in np.arange(0, x_s[0]):for j in np.arange(0, y_s[0]):dm[i, j] = self._dtw_distance(x[i, ::self.subsample_step],y[j, ::self.subsample_step])# Update progress bardm_count += 1p.animate(dm_count)return dmdef predict(self, x):"""Predict the class labels or probability estimates for the provided dataArguments---------x : array of shape [n_samples, n_timepoints]Array containing the testing data set to be classifiedReturns-------2 arrays representing:(1) the predicted class labels (2) the knn label count probability"""dm = self._dist_matrix(x, self.x)# Identify the k nearest neighborsknn_idx = dm.argsort()[:, :self.n_neighbors]# Identify k nearest labelsknn_labels = self.l[knn_idx]# Model Labelmode_data = mode(knn_labels, axis=1)mode_label = mode_data[0]mode_proba = mode_data[1]/self.n_neighborsreturn mode_label.ravel(), mode_proba.ravel()class ProgressBar:"""This progress bar was taken from PYMC"""def __init__(self, iterations):self.iterations = iterationsself.prog_bar = '[]'self.fill_char = '*'self.width = 40self.__update_amount(0)if have_ipython:self.animate = self.animate_ipythonelse:self.animate = self.animate_noipythondef animate_ipython(self, iter):
#         print('\r', self,)
#         sys.stdout.flush()self.update_iteration(iter + 1)def update_iteration(self, elapsed_iter):self.__update_amount((elapsed_iter / float(self.iterations)) * 100.0)self.prog_bar += '  %d of %s complete' % (elapsed_iter, self.iterations)def __update_amount(self, new_amount):percent_done = int(round((new_amount / 100.0) * 100.0))all_full = self.width - 2num_hashes = int(round((percent_done / 100.0) * all_full))self.prog_bar = '[' + self.fill_char * num_hashes + ' ' * (all_full - num_hashes) + ']'pct_place = (len(self.prog_bar) // 2) - len(str(percent_done))pct_string = '%d%%' % percent_doneself.prog_bar = self.prog_bar[0:pct_place] + \(pct_string + self.prog_bar[pct_place + len(pct_string):])def __str__(self):return str(self.prog_bar)

模型训练、预测

m = KnnDtw(n_neighbors=1, max_warping_window=10)
m.fit(x_train[::10], y_train[::10]) # 做数据采样,每10个元素采样
label, proba = m.predict(x_test[::10])

模型评估

from sklearn.metrics import classification_report, confusion_matrix
print(classification_report(label, y_test[::10],target_names=[l for l in labels.values()]))conf_mat = confusion_matrix(label, y_test[::10])fig = plt.figure(figsize=(6,6))
width = np.shape(conf_mat)[1]
height = np.shape(conf_mat)[0]res = plt.imshow(np.array(conf_mat), cmap=plt.cm.summer, interpolation='nearest')
for i, row in enumerate(conf_mat):for j, c in enumerate(row):if c>0:plt.text(j-.2, i+.1, c, fontsize=16)
cb = fig.colorbar(res)
plt.title('Confusion Matrix')
_ = plt.xticks(range(6), [l for l in labels.values()], rotation=90)
_ = plt.yticks(range(6), [l for l in labels.values()])

3、为目标股票ta筛选Top10高相似的股票

3.1 计算股票的dtw距离,并筛选出Top10高相关股票
m._dtw_distance(x_train[1], x_train[1112])x_similaritys = {}
# 选定一支目标股票
ta = x_test[0] 
# 分别计算其他200支股票与目标股票的相关性系数
for stock_id, stock_data in enumerate(x_test[1:200]):dtw_dist = m._dtw_distance(ta, stock_data)if stock_id not in x_similaritys.keys() and stock_id!=0:x_similaritys[stock_id] = dtw_dist#选出与目标股票Top10相似的股票k线
res = sorted(x_similaritys.items(), key=lambda x: x[1])
print("与目标股票Ta高相关的10支股票为:")
for stock_id,dtw_dist in res[:10]:print("股票id:",stock_id," , DTW距离: ",dtw_dist)

3.2 可视化Top10高相似股票曲线
# 绘制TopN股票趋势曲线
plt.figure(figsize=(11, 11))
for i, (stock_id,dtw_dist) in enumerate(res[:10]):
#     print(i, stock_id,dtw_dist)plt.subplot(5,2,i+1)plt.plot(x_test[0][:100], label="stock-0", color=colors[0], linewidth=2)plt.plot(x_test[stock_id][:100], label="stock-%d"%stock_id, color=(0.2/(i+1),0.1/(i+1),0.7-0.2/(i+1)-0.1/(i+1)), linewidth=2)plt.xlabel('time')plt.legend(loc='upper left')plt.tight_layout()

红色的是目标股票Ta的k-线图,与之高相似的10支股票的k-线图为蓝色曲线,分别呈现在10个子图中,做为对比可视化,能更直观的对两支股票的走势作出差异对比。方便交付给投资者对比查看股票走势。

Done

http://www.hrbkazy.com/news/13881.html

相关文章:

  • 专门做效果图的网站企业推广网
  • 免费个人网站域名注册线下宣传渠道和宣传方式
  • 如何在百度上推广自己福州seo网站排名
  • 建网站的公司公司成都网站建设创新互联
  • 常州网络公司鼎豪网络网站建设站长之家网站排名
  • 做电影网站赚钱知乎郑州网站建设推广优化
  • 株洲做网站优化新手网络推广怎么干
  • 服务号与wordpressseo排名优化软件有
  • 网站建设报价免费网站建站
  • 做网站需要买空间么 服务器网络搜索词排名
  • 网络设计课程seo的方式包括
  • 做网站在自己电脑建立虚拟机网络推广优化服务
  • 大学生职业生涯规划pptwindows优化大师
  • 我的世界服务器网站怎么做seo目标关键词优化
  • 特乐网站建设网络优化包括
  • 搜索公众号seo搜索引擎优化策略
  • 自己做商业网站百度竞价关键词价格查询工具
  • app客户端网站建设方案dw网页设计模板网站
  • icp备案网站负责人在线生成html网页
  • 新加坡网站大全网站建设公司地址在哪
  • 忻州网站建设公司国内搜索引擎有哪些
  • 博山专业网站优化哪家好网上推广产品哪个网好
  • 青岛哪家做网站的公司网站流量查询站长之家
  • 深圳网站制作的公司有哪些中国免费广告网
  • 马鞍山网站seo网站管理与维护
  • php网站建设到护卫神中国网评中国网评
  • 腾讯云学生怎么做网站的品牌推广计划书怎么写
  • 深圳英文网站设计网络销售怎么做才能有业务
  • 韩国有哪些做潮牌的网站亚马逊关键词
  • 建设集团网站方案黄页网络的推广网站有哪些