《用redis解决问题》word版.docx
用redis解决问题1. 架构Data structure server .1.1. Data type http:/redis.io/topics/data-types-introList 小于512成员, ziplist; >, double linked list .Hash (2.6 Zipmap ? )链地址法,hash tableSet Hash setSorted set,使用 skip list1.2. 性能1.2.1. 哈希算法Redis 目前使用两种不同的哈希算法:1. MurmurHash2 32 bit 算法:这种算法的分布率和速度都非常好,具体信息请参考MurmurHash的主页:。dictAdd 在每次向字典添加新键值对之前,都会对哈希表ht0 进行检查,对于ht0 的size 和used 属性,如果它们之间的比率ratio = used / size 满足以下任何一个条件的话,rehash 过程就会被激活:1. 自然rehash :ratio >= 1 ,且变量dict_can_resize 为真。2. 强制rehash : ratio 大于变量dict_force_resize_ratio (目前版本中,dict_force_resize_ratio 的值为51.2.2. Hash table的性能操作函数算法复杂度创建一个新字典dictCreate O(1)添加新键值对到字典dictAdd O(1)添加或更新给定键的值dictReplace O(1)在字典中查找给定键所在的节点dictFind O(1)在字典中查找给定键的值dictFetchValue O(1)从字典中随机返回一个节点dictGetRandomKey O(N)根据给定键,删除字典中的键值对dictDelete O(1)清空并释放字典dictRelease O(N)清空并重置(但不释放)字典dictEmpty O(N)缩小字典dictResize O(N)扩大字典dictExpand O(N)对字典进行给定步数的rehash dictRehash O(N)在给定毫秒内,对字典进行rehash dictRehashMilliseconds O(N)1.2.3. List 的性能Pop, push , O(1)1.2.4. Skip list 性能以下是操作这两个数据结构的API ,它们的作用以及相应的算法复杂度:函数作用复杂度zslCreateNode 创建并返回一个新的跳跃表节点最坏O(1)zslFreeNode 释放给定的跳跃表节点最坏O(1)zslCreate 创建并初始化一个新的跳跃表最坏O(N)zslFree 释放给定的跳跃表最坏O(N)zslInsert 将一个包含给定score 和member的新节点添加到跳跃表中最坏O(N) 平均O(logN)zslDeleteNode 删除给定的跳跃表节点最坏O(N)zslDelete 删除匹配给定member 和score 的元素最坏O(N) 平均O(logN)zslFirstInRange 找到跳跃表中第一个符合给定范围的元素最坏O(N) 平均O(logN)zslLastInRange 找到跳跃表中最后一个符合给定范围的元素最坏O(N) 平均O(logN)zslDeleteRangeByScore 删除score 值在给定范围内的所有节点最坏O(N2)zslDeleteRangeByRank 删除给定排序范围内的所有节点最坏O(N2)zslGetRank 返回目标元素在有序集中的排位最坏O(N) 平均O(logN)zslGetElementByRank 根据给定排位,返回该排位上的元素节点最坏O(N) 平均O(logN)1.3. 优化1.3.1. Presharding 1.3.2. config1.3.3. monitor2. 场景2.1. Good use case 2.1.1. list l most recent id_list = redis.lrange("ments",start,start+num_items-1)l queueRedis lists have an blpop and brpop command which returns and removes the first (or last) element from the list orblocks until one is available. These can be used to power a simple queue.2.1.2. sorted set l top score userZADD leaderboard <score> <username>To get the top 100 users by score is as easy as ZREVRANGE leaderboard 0 99. Similarly to tell the user its global rank you just do ZRANK leaderboard <username>. l Expired users· Every time a new item is added to our (non Redis) database we add it into the sorted set. As score we use the time at which this item should expire, in other words the current_time+time_to_live.· There is a background worker doing queries in the sorted set using for instance ZRANGE . WITHSCORES to take the latest 10 items. If there are scores representing unix times already in the past, we delete this items from the database.2.1.3. Mix list & sorted set· Every time a new news is posted we add the ID into a list, with LPUSH + LTRIM in order to take only the latest 1000 items.· There is a worker that gets this list and continually computes the final score of every news in this set of 1000 news. The result is used to populate a sorted set with ZADD. Old news are removed from the sorted set in the mean time as a cleanup operation.2.1.4. SetEvery time I get a new pageview I just do the following:SADD page:day1:<page_id> <user_id>Of course instead of day1 you may want to use the first second of today, as unix time, like: time()-(time()%3600*24), or something like that. Want to know the number of unique users? Just do SCARD page:day1:<page_id>. Need to test if a specific user already accessed that page? Just do SISMEMBER page:day1:<page_id>2.1.5. counter· 记录page viewINCR user:<id>EXPIRE user:<id> 60· Get unique id· In order to make this algorithm binary safe (they are just tags but think to utf8, spaces and so forth) we start performing the SHA1 digest of the tag. SHA1(redis) = b840fc02d524045429941cc15f59e41cb7be6c52.· Let's check if this tag is already associated with a unique ID with the command GET tag:b840fc02d524045429941cc15f59e41cb7be6c52:id.· If the above GET returns an ID, return it back to the user. We already have the unique ID.· Otherwise. create a new unique ID with (assume it returned 123456).· Finally associate this new ID to our tag with SETNX tag:b840fc02d524045429941cc15f59e41cb7be6c52:id 123456. By using SETNX if a different client was faster than this one the key will not be setted. Not only, SETNX returns 1 if the key is set, 0 otherwise. So. let's add a final step to our computation.· If SETNX returned 1 (We set the key) return 123456 to the caller, it's our tag ID, otherwise perform GET tag:b840fc02d524045429941cc15f59e41cb7be6c52:id and return the value to the caller.2.1.6. Caching2.1.7. Sortsort watch:leto by bug:*->priority get bug:*->details2.1.8. Luascript = <<-eoslocal friend_names = redis.call(smembers, KEYS1)local friends = for i = 1, #friend_names dolocal friend_key = user: . friend_namesilocal gender = redis.call(hget, friend_key, gender)if gender = ARGV1 thentable.insert(friends, redis.call(hget, friend_key, details)endendreturn friendseosRedis.new.eval(script, friends:leto, m)23The above code gets the details for all of Letos male friends. Notice that to call Redis commands within our script weuse the redis.call(”command”, ARG1, ARG2, .) method.2.1.9. Pub/sub2.2. Antifor instance a Redis data set can't be bigger than available memory, so if you have some big data application and a mostly-reads access pattern, Redis is not the right pick. 2.2.1. 不要用 “keys”you mightbe tempted (as I was!) to use the keys command:keys bug:1233:*The better solution is to use a hash. Much like we can use hashes to provide a way to expose secondary indexes, sotoo can we use them to organize our data:hset bugs:1233 1 ”id:1, account: 1233, subject: .”hset bugs:1233 2 ”id:2, account: 1233, subject: .”3. 应用3.1. 爬虫调度可以忍受偶尔的丢失。定期重建。3.2. 实时转化率与价格的关系分析可以。4. Faq4.1. 是threadsafe 的吗?Thread Safe.4.2. Redis Atomic counter 与cassandra的异同? Integer 是32位的吗?是int_32_t .4.3. Redis 会不会丢失数据?会。We mentioned before that Redis is an in-memory persistent store. With respect to persistence, by default, Redissnapshots the database to disk based on how many keys have changed. You configure it so that if X number of keyschange, then save the database every Y seconds. By default, Redis will save the database every 60 seconds if 1000 ormore keys have changed all the way to 15 minutes if 9 or less keys has changed.Alternatively (or in addition to snapshotting), Redis can run in append mode. Any time a key changes, an append-onlyfile is updated on disk. In some cases its acceptable to lose 60 seconds worth of data, in exchange for performance,should there be some hardware or software failure. In some cases such a loss is not acceptable. Redis gives you theoption. In chapter 6 well see a third option, which is offloading persistence to a slave4.4. Redis 用zookeeper 实现fail over? 主从切换需要多长时间?ZookeeperKeepalived 4.5. Pub/sub ,适用的场合,与active mq 相比有什么局限?或好处? Rpoplpush redis queue ? 会丢失。 性能更好。4.6. Redis cluster 是否值得等待?是 Fail over Auto sharding 4.7. key的数量到多少,性能是有保证的?32 bit , 4g Ratio (used/size), 0.7 .4.8. 能否预设size, 以避免re size?我们知道哈希表最初的大小是由 DICT_HT_INITIAL_SIZE 决定的。4.9. Redis 内存管理,能否支持96G ram? 可以。 最好用多个instance .用maxmemory 限制大小4.10. mostly-reads access pattern, Redis is not the right pick. Why ? 我觉得也可以,否则自己实现。4.11. AOF 与RDB ,如何选择?