面向频繁模式挖掘的差分隐私保护研究综述.pdf





《面向频繁模式挖掘的差分隐私保护研究综述.pdf》由会员分享,可在线阅读,更多相关《面向频繁模式挖掘的差分隐私保护研究综述.pdf(10页珍藏版)》请在淘文阁 - 分享文档赚钱的网站上搜索。
1、2014?10?Journal on Communications October 2014?35?10?Vol.35 No.10?1?1,2(1.?100190?2.?100190)?3?TP309.2;TP392?A?1000-436X(2014)10-0200-10 Survey of differential privacy in frequent pattern mining DING Li-ping1,LU Guo-qing1,2(1.National Engineering Research Center of Fundamental Software,Institute of
2、Software,Chinese Academy of Sciences,Beijing 100190,China;2.University of Chinese Academy of Sciences,Beijing 100190,China)Abstract:Frequent pattern mining is an exploratory problem in the field of data mining.However,directly releasing the discovered frequent patterns and the corresponding true sup
3、ports may reveal the individuals privacy.The state-of-the-art solution for this problem is differential privacy,which offers a strong degree of privacy protection by adding noise.Firstly,the theoretical basis of differential privacy was introduced.Then,three representative frequent pattern mining me
4、thods under differential privacy were summarized and compared in detail.Finally,some future research directions were discussed.Key words:differential privacy;privacy protection;frequent pattern;data mining 1?(FPM,frequent pattern mining)1?(?)?k-?2?1)?2)?(DP,differential privacy)?Dwork?2006?2?2?3?1)?
5、(?)?2)?2?4?1)?2014-03-03?2014-04-20?(2012ZX01039-004)?(XDA06010600)Foundation Items:The National Science and Technology Major Program of China(2012ZX01039-004);The Strategic Technology Pilot Program of the Chinese Academy of Sciences(XDA06010600)doi:10.3969/j.issn.1000-436x.2014.10.023?10?201?2)?(FI
6、M,frequent itemset mining)?(FSM,frequent sequence mining)?(FGM,frequent subgraph mining)3?3?2?2.1?2,58?1 (-?2)?2?1D?2D?K?Range()K?K?K?-?Range()SK?12Pr()exp()Pr()K DSK DS?(1)?PrEs?Es?K?0.01,0.1,ln2,ln3?2.2?(global sensitivity)?2 (?9)?:f dDR,f?12,121max()()D Dff Df D=(2)?1D?2D?d?f?R?f?D?(Laplace mecha
7、nism)9?(exponential mechanism)10?11?12?13?14?1 (?9)?:f dDR?K?(3)?K?-?1()()(/),(/)dK Df DLapfLapf=+?(3)?(/)(1)iLapfid?f?f?(3)?()K D?(1)iid?2E(/)iferrorLapf=(4)?3 (?10)?:u()DOR?K?(5)?K?-?(,)(,)|Prexp()2u D rK D urrOu=(5)?u?(,)u D r?(,)()u D r rO?r?O?(5)?2.3?2?15?(sequential composition)?(parallel comp
8、osition)?1 (?15)?D?iK?(1)in?-i?iK?D?-i?2 (?15)?D?iK?(1)in?-i?iK?D?max()-i?2?1)?-?1?1?2?-?2)?-?202?35?1?2?2.4?2?1)?(,)userfullness 16?4 (,)userfullness 16)?K?Q?D?()DK D=?(6)?K?(,)userfullness?Pr|()()|1Q DQ Da?(6)2)?F-measure?top-k?F1-score?3?4?2?1)?2)?2?1?3?3.1?17?Apriori17?FP-growth18?TF19?FIM?(trun
9、cated frequency)?2?top-k?TF?2?1)?l?C?top-k?2)?k?2?(?PT-Exp)?(?PT-Lap)?1?2?1?TF?2?k?(2/)Lapkn?1?10?203?1?C?top-k?C?|lCI?|I?C?TF?C?p?ma()x(),kpf pff=?kf?C?k?C?01,=,kkSffCSff?(7)?C?0S?1S?1S?(1|1S=)?0S?(0|S?0|SC?)?C?0|1|CS+?TF?2?1 TF?2?PT-Exp?PT-Lap?1?Pr exp()4npf pk?C?k?top-k?C?)(4/Lapkn?top-k?2?k?)(2/
10、Lapkn?k?)(2/Lapkn?2(c)?TF?PT-Exp?l=2?top-3?TF?C?ln(|)4|/Iknl?k?l?0kf?()f p=max(),)kf pf?2 TF?PT-Exp?TF?PrivBasis20?-?top-k?1)?D?F?D?F?2)?F?-?P?F?P?-?B?3)?B?C B?C B?PrivBasis?-?B?(maximal clique)?F?P?B?F?P?,)G F P?,)G F P?-?-?-?B?3?-?B?D?-?341,I I I?B?top-k?l=2?top-3?13,I I?14,I I?34,I I?3 -?TF?top-k
11、?1)k?2)?21?DP-topkP?22?SmartTrunc?(?)?23?DiffPart?204?35?1)?2)?3)?D%?4?4?1,2,7,8T TTT?1v?4?DiffPart?1v?14vN=+1(1/)vLap?1v?1v?DiffPart?iv?)(ic v?22C ()/ih v?2C?()ih v?iv?iv?iv?)(ic v?12/(/2)C+?121,CC?DiffPart?/2?/2?4 DiffPart?2?TF?k?l?2?(|)O kD PrivBasis-?TF?(|)O w D DP-topkP?k?l?2?(|)O kD SmartTrunc
12、?(|)DOL DiffPart?2?(|)DOL DiffMR?MapReduce?k?l?(|)O kD?10?205?1111211111+1+1+2222mmiinnnnn=?(8)?m?in?i?1(,1)iimnnn+=?21924?3.2?25?(?)?(?Web?)?(?DNA?)?FSM?FIM?26?DiffPart23?(prefix tree)?STM-Full?1)?PT?2)?D%?1)?p?,ivp 1|)|tr(tr()|iivv+?tr()iv?iv?iv?1iv+?2)tr(|),|vv?children()|tr()|uvu?STM-basic?STM-F
13、ull?(?)?0?STM-Full?k?m?k?,()B m p?p=exp()/2?exp0,0()=1(),0 xp xxx?(9)?2 2/=?2?STM-Full?STM-Full?27?n-gram?N-gram?1)?D?maxgramn(maxn?n-gram?)?T?2)?T?n-gram?3)?(Markov assumption)?n-gram?n-gram?T?D%?3)?T?D%?1)?T?T?nmax?(?N-gram-uniform)?5?3?max5n=?2/5?N-gram?v?(10)?maxP?v?()c v?v?maxmaxmin(log),()vPni
14、c v=(10)?206?35?5?/5?1v?1max100.4314 109vPh=+?/54/51v=?N-gram?maxn-gram?T?3?1)?maxn?2)?3)?5?T?max=5=3n?1v?2v?3v?2830?(,)userfullness?STM-Full26?PT?1v?lo(g1/1/)iO?1v?210log1(1/)niiO=?i?1v?29?PT-Sample29?6?1)?PT?2)?top-k?6 PT-Sample?PT-Sample?2?PT?top-k?top-k?326,27,29,31?3.3?32?5 N-gram?3?STM-Full?(|
15、)DOL Prefix-Hybrid?(|)DOL N-gram?N-gram?(|)DOL PT-Sample?1(|)maxhLO D+?10?207?,)(GV E=?,()sssGV E=?sVV?sEE?sG?G?2?111)(,VGE=?222)(,VGE=?2?1V?2V?1E?2E?1G?2G?2G?1G?NPC?33?TF19?PT-Exp?(MCMC)?top-k?Diff-FPM?1)MCMC?k?top-k?2)?k?Diff-FPM?MCMC?top-k?MCMC?()x?MH(metropolis-hastings)?1)?(POFG)?POFG?2)?()()11
16、()/2()()/expexp2x Xu xuxu xu=(11)?x?()u x?3)?Diff-FPM?xyq?x?y?()()11expmin,1ex()/2(p)/2yxxyxyyqxuuqauu=(12)?A-A-D?=1/()xyqN x?()N x?x?A-D?()=5()10N xN y=?exp(3/2)(1/10)min0.82exp(2/2)(1/5)xya=?A-A-D?82%?A-D?Diff-FPM?MCMC?()x?(,)?34?4?4.1?1)?SuLQ-based ID35?DiffP-C4.535?DiffGen36?(information gain)?2
17、)?(means)?(center)?(median)?Pk-means37,38?Pk-median?39?GUPT40?k?NP?k?2?k?208?35?4.2?PDBSCAN41,42?4.3?1)?30?2)?MapReduce?Diff-FPM24?MapReduce?Airavat43?NoSQL?5?1 HAN J,KAMBER M,PEI J.Data Mining:Concepts and Tech-niquesM.Morgan kaufmann,2006.2 DWORK C.Differential privacyA.Proc of the 33rd Internatio
18、nal Colloquium on Automata,Languages and ProgrammingC.Berlin:Spinger-Verlag,2006.3?,?,?.?J.?,2014,37(1):101:122.XIONG P,ZHU T Q,WANG X F.A survey on differential privacy and applicationsJ.Chinese Journal of Computers,2014,37(1):101-122.4?,?.?J.?,2014,37(4):927-949.ZHANG X J,MENG X F.Differential pri
19、vacy in data publication and analysisJ.Chinese Journal of Computers,2014,37(4):927-949.5 BLUM A,DWORK C,McSHERRY F,et al.Practical privacy:the SuLQ frameworkA.Proc of the 24th ACM SIGMOD International Conference on Management of Data/Principles of Database Sys-temsC.NewYork:ACM Press,2005.128-138.6
- 配套讲稿:
如PPT文件的首页显示word图标,表示该PPT已包含配套word讲稿。双击word图标可打开word文档。
- 特殊限制:
部分文档作品中含有的国旗、国徽等图片,仅作为作品整体效果示例展示,禁止商用。设计者仅对作品中独创性部分享有著作权。
- 关 键 词:
- 面向 频繁 模式 挖掘 隐私 保护 研究 综述

限制150内