基本信息来源于合作网站,原文需代理用户跳转至来源网站获取       
摘要:
The majority of big data analytics applied to transportation datasets suffer from being too domain-specific,that is,they draw conclusions for a dataset based on analytics on the same dataset.This makes models trained from one domain(e.g.taxi data)applies badly to a different domain(e.g.Uber data).To achieve accurate analyses on a new domain,substantial amounts of data must be available,which limits practical applications.To remedy this,we propose to use semi-supervised and active learning of big data to accomplish the domain adaptation task:Selectively choosing a small amount of datapoints from a new domain while achieving comparable performances to using all the datapoints.We choose the New York City(NYC)transportation data of taxi and Uber as our dataset,simulating different domains with 90%as the source data domain for training and the remaining 10%as the target data domain for evaluation.We propose semi-supervised and active learning strategies and apply it to the source domain for selecting datapoints.Experimental results show that our adaptation achieves a comparable performance of using all datapoints while using only a fraction of them,substantially reducing the amount of data required.Our approach has two major advantages:It can make accurate analytics and predictions when big datasets are not available,and even if big datasets are available,our approach chooses the most informative datapoints out of the dataset,making the process much more efficient without having to process huge amounts of data.
内容分析
关键词云
关键词热度
相关文献总数  
(/次)
(/年)
文献信息
篇名 Analyzing Cross-domain Transportation Big Data of New York City with Semi-supervised and Active Learning
来源期刊 计算机、材料和连续体(英文) 学科 工学
关键词 Big data TAXI and uber domain ADAPTATION active LEARNING SEMI-SUPERVISED LEARNING
年,卷(期) 2018,(10) 所属期刊栏目
研究方向 页码范围 1-9
页数 9页 分类号 TP3
字数 语种
DOI
五维指标
传播情况
(/次)
(/年)
引文网络
引文网络
二级参考文献  (0)
共引文献  (0)
参考文献  (0)
节点文献
引证文献  (0)
同被引文献  (0)
二级引证文献  (0)
2018(0)
  • 参考文献(0)
  • 二级参考文献(0)
  • 引证文献(0)
  • 二级引证文献(0)
研究主题发展历程
节点文献
Big
data
TAXI
and
uber
domain
ADAPTATION
active
LEARNING
SEMI-SUPERVISED
LEARNING
研究起点
研究来源
研究分支
研究去脉
引文网络交叉学科
相关学者/机构
期刊影响力
计算机、材料和连续体(英文)
月刊
1546-2218
江苏省南京市浦口区东大路2号东大科技园A
出版文献量(篇)
346
总下载数(次)
4
总被引数(次)
0
论文1v1指导