Xiaohongshu x-s and x-s-common Algorithm Research and Analysis (for learning purposes only)

Contents
1. Preface
2. Parameter analysis
  o 2.1. x-s, x-t, x-s-common

1. Preface

I recently spent some time analyzing xhs. I did not go very deep, and I drew on a number of open-source examples found online. This is a brief record for anyone who is interested.

I previously spent a while on a certain short-video app, where all of the aweme-related endpoints can be reached once its anti-capture protection is dealt with. Many experienced people already have mature solutions for that, such as LSPosed (lsp) and Xposed (xp) modules.

For xhs, the Web side is comparatively simple. Essentially all of the sns-related endpoints can be reached as well; once the few x-series signed parameters are handled, the rest is easy.

The real difficulty is risk control: the various restrictions can be genuinely frustrating.

2. Parameter analysis

2.1. x-s, x-t, x-s-common

Start with a global search for x-s and x-t, set a breakpoint there, and once it is hit you can begin analyzing and debugging. All three parameters (x-s, x-t, x-s-common) are required for subsequent requests.

a1 comes from the cookies and takes part in the x-s-common signature.

x-s-common must be sent. Some publicly available resources omit it; without it the request succeeds but returns no data.

Continuing the breakpoint analysis shows how the x-s-common signature is generated. The values passed into x-s-common can be worked out at the breakpoint, as follows:

    common = {
        "s0": 5,  # fixed value
        "s1": "",  # fixed value
        "x0": "1",  # fixed value
        "x1": "",  # version (value not preserved in the source)
        "x2": "Windows",  # fixed value
        "x3": "xhs-pc-web",  # fixed value
        "x4": "3.15.9",  # fixed value
        "x5": a1,  # a1 from the cookies, used for verification
        "x6": x_t,
        "x7": x_s,
        "x8": b1,  # window.localStorage (client browser information identifier)
        "x9": mrc(x_t + x_s + b1),  # signature produced from x6, x7, x8
        "x10": 1,
    }

The b1 value used for x8 is read from window.localStorage.

If the signature parameters are wrong or the algorithm is incomplete, requests are essentially all rejected:

    {'code': -1, 'success': False}

Once the signature parameters are generated correctly, testing shows that the endpoint returns data normally.

Part of the signature algorithm follows; any missing environment can be filled in based on your own debugging:

    import binascii
    import ctypes
    import hashlib
    import json
    import random
    import re
    import string
    import time
    import urllib.parse

    import requests


    def sign(uri, data=None, ctime=None, a1="", b1=""):
        def h(n):
            # Base64-style encoding of the 32-character MD5 hex digest with a
            # custom 65-character alphabet; index 64 serves as padding.
            m = ""
            d = "A4NjFqYu5wPHsO0XTdDgMa2r1ZQocVte9UJBvk6/7=yRnhISGKblCWi+LpfE8xzm3"
            for i in range(0, 32, 3):
                o = ord(n[i])
                g = ord(n[i + 1]) if i + 1 < 32 else 0
                h = ord(n[i + 2]) if i + 2 < 32 else 0
                x = ((o & 3) << 4) | (g >> 4)
                p = ((15 & g) << 2) | (h >> 6)
                v = o >> 2
                b = h & 63 if h else 64
                if not g:
                    p = b = 64
                m += d[v] + d[x] + d[p] + d[b]
            return m

        # Millisecond timestamp unless the caller supplies one.
        v = int(round(time.time() * 1000) if not ctime else ctime)
        body = (
            json.dumps(data, separators=(",", ":"), ensure_ascii=False)
            if isinstance(data, dict)
            else ""
        )
        raw_str = f"{v}test{uri}{body}"
        md5_str = hashlib.md5(raw_str.encode("utf-8")).hexdigest()
        x_s = h(md5_str)
        x_t = str(v)

        common = {
            "s0": 5,
            "s1": "",
            "x0": "1",
            "x1": "",  # version (value not preserved in the source)
            "x2": "Windows",
            "x3": "xhs-pc-web",
            "x4": "",  # value not preserved in the source
            "x5": a1,
            "x6": x_t,
            "x7": x_s,
            "x8": b1,
            # The breakdown above lists mrc(x_t + x_s + b1); mrc() reads only the
            # first 57 characters (13-digit x_t plus 44-character x_s).
            "x9": mrc(x_t + x_s),
            "x10": 1,
        }
        encodeStr = encodeUtf8(json.dumps(common, separators=(",", ":")))
        x_s_common = b64Encode(encodeStr)
        return {
            "x-s": x_s,
            "x-t": x_t,
            "x-s-common": x_s_common,
        }


    def get_a1_and_web_id():
        def random_str(length):
            alphabet = string.ascii_letters + string.digits
            return "".join(random.choice(alphabet) for _ in range(length))

        d = hex(int(time.time() * 1000))[2:] + random_str(30) + "5" + "0" + "000"
        g = (d + str(binascii.crc32(str(d).encode("utf-8"))))[:52]
        return g, hashlib.md5(g.encode("utf-8")).hexdigest()


    # The CDN hostnames are truncated in the source and are left as-is.
    img_cdns = [
        "https://sns-img-",
        "https://sns-img-",
        "https://sns-img-",
        "https://sns-img-",
    ]


    def get_img_url_by_trace_id(trace_id: str, format: str = "png"):
        return f"{random.choice(img_cdns)}/{trace_id}?imageView2/format/{format}"


    def get_img_urls_by_trace_id(trace_id: str, format: str = "png"):
        return [f"{cdn}/{trace_id}?imageView2/format/{format}" for cdn in img_cdns]


    def get_trace_id(img_url: str):
        return img_url.split("/")[-1].split("!")[0]


    def get_imgs_url_from_note(note) -> list:
        """the return type is [img1_url, img2_url, ...]"""
        imgs = note["image_list"]
        if not len(imgs):
            return []
        return [get_img_url_by_trace_id(get_trace_id(img["info_list"][0]["url"])) for img in imgs]


    def get_imgs_urls_from_note(note) -> list:
        """the return type is [[img1_url1, img1_url2, img1_url3], [img2_url, img2_url2, img2_url3], ...]"""
        imgs = note["image_list"]
        if not len(imgs):
            return []
        return [get_img_urls_by_trace_id(img["trace_id"]) for img in imgs]


    # The CDN hostnames are truncated in the source and are left as-is.
    video_cdns = [
        "https://sns-video-",
        "https://sns-video-",
        "https://sns-video-",
        "https://sns-video-",
    ]


    def get_video_url_from_note(note) -> str:
        if not note.get("video"):
            return ""
        origin_video_key = note["video"]["consumer"]["origin_video_key"]
        return f"{random.choice(video_cdns)}/{origin_video_key}"


    def get_video_urls_from_note(note) -> list:
        if not note.get("video"):
            return []
        origin_video_key = note["video"]["consumer"]["origin_video_key"]
        return [f"{cdn}/{origin_video_key}" for cdn in video_cdns]


    def download_file(url: str, filename: str):
        # Stream the response to disk in 8 KiB chunks.
        with requests.get(url, stream=True) as r:
            r.raise_for_status()
            with open(filename, "wb") as f:
                for chunk in r.iter_content(chunk_size=8192):
                    f.write(chunk)


    def get_valid_path_name(text):
        # Replace characters that are not allowed in file names with "_".
        invalid_chars = '<>:"/\\|?*'
        return re.sub("[{}]".format(re.escape(invalid_chars)), "_", text)


    def mrc(e):
        ie = [
            0, 1996959894, 3993919788, 2567524794, 124634137, 1886057615, 3915621685, 2657392035,
            249268274, 2044508324, 3772115230, 2547177864, 162941995, 2125561021, 3887607047, 2428444049,
            498536548, 1789927666, 4089016648, 2227061214, 450548861, 1843258603, 4107580753, 2211677639,
            325883990, 1684777152, 4251122042, 2321926636, 335633487, 1661365465, 4195302755, 2366115317,
            997073096, 1281953886, 3579855332, 2724688242, 1006888145, 1258607687, 3524101629, 2768942443,
            901097722, 1119000684, 3686517206, 2898065728, 853044451, 1172266101, 3705015759, 2882616665,
            651767980, 1373503546, 3369554304, 3218104598, 565507253, 1454621731, 3485111705, 3099436303,
            671266974, 1594198024, 3322730930, 2970347812, 795835527, 1483230225, 3244367275, 3060149565,
            1994146192, 31158534, 2563907772, 4023717930, 1907459465, 112637215, 2680153253, 3904427059,
            2013776290, 251722036, 2517215374, 3775830040, 2137656763, 141376813, 2439277719, 3865271297,
            1802195444, 476864866, 2238001368, 4066508878, 1812370925, 453092731, 2181625025, 4111451223,
            1706088902, 314042704, 2344532202, 4240017532, 1658658271, 366619977, 2362670323, 4224994405,
            1303535960, 984961486, 2747007092, 3569037538, 1256170817, 1037604311, 2765210733, 3554079995,
            1131014506, 879679996, 2909243462, 3663771856, 1141124467, 855842277, 2852801631, 3708648649,
            1342533948, 654459306, 3188396048, 3373015174, 1466479909, 544179635, 3110523913, 3462522015,
            1591671054, 702138776, 2966460450, 3352799412, 1504918807, 783551873, 3082640443, 3233442989,
            3988292384, 2596254646, 62317068, 1957810842, 3939845945, 2647816111, 81470997, 1943803523,
            3814918930, 2489596804, 225274430, 2053790376, 3826175755, 2466906013, 167816743, 2097651377,
            4027552580, 2265490386, 503444072, 1762050814, 4150417245, 2154129355, 426522225, 1852507879,
            4275313526, 2312317920, 282753626, 1742555852, 4189708143, 2394877945, 397917763, 1622183637,
            3604390888, 2714866558, 953729732, 1340076626, 3518719985, 2797360999, 1068828381, 1219638859,
            3624741850, 2936675148, 906185462, 1090812512, 3747672003, 2825379669, 829329135, 1181335161,
            3412177804, 3160834842, 628085408, 1382605366, 3423369109, 3138078467, 570562233, 1426400815,
            3317316542, 2998733608, 733239954, 1555261956, 3268935591, 3050360625, 752459403, 1541320221,
            2607071920, 3965973030, 1969922972, 40735498, 2617837225, 3943577151, 1913087877, 83908371,
            2512341634, 3803740692, 2075208622, 213261112, 2463272603, 3855990285, 2094854071, 198958881,
            2262029012, 4057260610, 1759359992, 534414190, 2176718541, 4139329115, 1873836001, 414664567,
            2282248934, 4279200368, 1711684554, 285281116, 2405801727, 4167216745, 1634467795, 376229701,
            2685067896, 3608007406, 1308918612, 956543938, 2808555105, 3495958263, 1231636301, 1047427035,
            2932959818, 3654703836, 1088359270, 936918000, 2847714899, 3736837829, 1202900863, 817233897,
            3183342108, 3401237130, 1404277552, 615818150, 3134207493, 3453421203, 1423857449, 601450431,
            3009837614, 3294710456, 1567103746, 711928724, 3020668471, 3272380065, 1510334235, 755167117,
        ]
        o = -1

        def right_without_sign(num: int, bit: int = 0) -> int:
            # Unsigned 32-bit right shift (the JavaScript >>> operator).
            val = ctypes.c_uint32(num).value >> bit
            MAX32INT = 4294967295
            return (val + (MAX32INT + 1)) % (2 * (MAX32INT + 1)) - MAX32INT - 1

        # Only the first 57 characters of the input are consumed.
        for n in range(57):
            o = ie[(o & 255) ^ ord(e[n])] ^ right_without_sign(o, 8)
        return o ^ -1 ^ 3988292384


    lookup = [
        "Z", "m", "s", "e", "r", "b", "B", "o", "H", "Q", "t", "N", "P", "+", "w", "O",
        "c", "z", "a", "/", "L", "p", "n", "g", "G", "8", "y", "J", "q", "4", "2", "K",
        "W", "Y", "j", "0", "D", "S", "f", "d", "i", "k", "x", "3", "V", "T", "1", "6",
        "I", "l", "U", "A", "F", "M", "9", "7", "h", "E", "C", "v", "u", "R", "X", "5",
    ]


    def tripletToBase64(e):
        return (
            lookup[63 & (e >> 18)]
            + lookup[63 & (e >> 12)]
            + lookup[(e >> 6) & 63]
            + lookup[e & 63]
        )


    def encodeChunk(e, t, r):
        m = []
        for b in range(t, r, 3):
            n = (16711680 & (e[b] << 16)) + ((e[b + 1] << 8) & 65280) + (e[b + 2] & 255)
            m.append(tripletToBase64(n))
        return "".join(m)


    def b64Encode(e):
        # Base64 over a list of byte values, using the custom `lookup` alphabet.
        P = len(e)
        W = P % 3
        U = []
        z = 16383
        H = 0
        Z = P - W
        while H < Z:
            U.append(encodeChunk(e, H, Z if H + z > Z else H + z))
            H += z
        if 1 == W:
            F = e[P - 1]
            U.append(lookup[F >> 2] + lookup[(F << 4) & 63] + "==")
        elif 2 == W:
            F = (e[P - 2] << 8) + e[P - 1]
            U.append(lookup[F >> 10] + lookup[63 & (F >> 4)] + lookup[(F << 2) & 63] + "=")
        return "".join(U)


    def encodeUtf8(e):
        # Percent-encode, then turn the result into a list of byte values.
        # The safe set mirrors JavaScript's encodeURIComponent; the quote
        # characters were lost in the source, so "~" and "'" are assumed here.
        b = []
        m = urllib.parse.quote(e, safe="~()*!.'")
        w = 0
        while w < len(m):
            T = m[w]
            if T == "%":
                E = m[w + 1] + m[w + 2]
                S = int(E, 16)
                b.append(S)
                w += 2
            else:
                b.append(ord(T[0]))
            w += 1
        return b


    def base36encode(number, alphabet="0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"):
        """Converts an integer to a base36 string."""
        if not isinstance(number, int):
            raise TypeError("number must be an integer")
        base36 = ""
        sign = ""
        if number < 0:
            sign = "-"
            number = -number
        if 0 <= number < len(alphabet):
            return sign + alphabet[number]
        while number != 0:
            number, i = divmod(number, len(alphabet))
            base36 = alphabet[i] + base36
        return sign + base36


    def base36decode(number):
        return int(number, 36)


    def get_search_id():
        e = int(time.time() * 1000) << 64
        t = int(random.uniform(0, 2147483646))
        return base36encode(e + t)


    def cookie_str_to_cookie_dict(cookie_str: str):
        # Split on the first "=" only, so values that contain "=" stay intact.
        cookie_blocks = [
            cookie_block.split("=", 1)
            for cookie_block in cookie_str.split(";")
            if cookie_block
        ]
        return {cookie[0].strip(): cookie[1].strip() for cookie in cookie_blocks}


    def cookie_jar_to_cookie_str(cookie_jar):
        cookie_dict = requests.utils.dict_from_cookiejar(cookie_jar)
        return ";".join(f"{key}={value}" for key, value in cookie_dict.items())


    def update_session_cookies_from_cookie(session: requests.Session, cookie: str):
        cookie_dict = cookie_str_to_cookie_dict(cookie) if cookie else {}
        # The dict |= operator requires Python 3.9 or later.
        if "a1" not in cookie_dict or "webId" not in cookie_dict:
            # a1, web_id = get_a1_and_web_id()
            cookie_dict |= {
                "a1": "187d2defea8dz1fgwydnci40kw265ikh9fsxn66qs50000726043",
                "webId": "ba57f42593b9e55840a289fa0b755374",
            }
        if "gid" not in cookie_dict:
            cookie_dict |= {
                "gid.sign": "PSF1M3U6EBC/Jv6eGddPbmsWzLI=",
                "gid": "yYWfJfi820jSyYWfJfdidiKK0YfuyikEvfISMAM348TEJC28K23TxI888WJK84q8S4WfY2Sy",
            }
        new_cookies = requests.utils.cookiejar_from_dict(cookie_dict)
        session.cookies = new_cookies
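The 256-entry table `ie` inside `mrc` starts with 0, 1996959894, 3993919788, which are the opening entries of the standard CRC-32 lookup table (reflected polynomial 0xEDB88320). The sketch below is an illustration rather than part of the original listing: it regenerates that table and checks a table-driven CRC-32 against `zlib.crc32`. The names `build_crc32_table` and `crc32_table_driven` are hypothetical, and the extra final XOR with 3988292384 that `mrc` applies is deliberately not reproduced here.

```python
import zlib


def build_crc32_table(poly: int = 0xEDB88320) -> list[int]:
    """Build the 256-entry table for the reflected CRC-32 polynomial."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            # Shift right; XOR in the polynomial whenever the low bit was set.
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


def crc32_table_driven(data: bytes, table: list[int]) -> int:
    """Plain table-driven CRC-32 (initial value and final XOR of 0xFFFFFFFF)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = table[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


TABLE = build_crc32_table()

if __name__ == "__main__":
    # First entries should match the hard-coded list in mrc().
    print(TABLE[:4])
    print(crc32_table_driven(b"hello", TABLE) == zlib.crc32(b"hello"))
```

Generating the table at import time keeps the listing short and makes a transcription error in the 256 constants easy to spot by comparing the two lists.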
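`b64Encode` with its `lookup` list appears to be ordinary Base64 with a permuted alphabet. Assuming that reading is right, the same mapping can be sketched with the standard library plus a character translation, which also gives a decoder for inspecting an encoded value. The helper names `custom_b64encode` and `custom_b64decode` are hypothetical and do not come from the original listing. The alphabet string is the `lookup` list joined in order.

```python
import base64

# Standard Base64 alphabet and the permuted one from the `lookup` list.
STD = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
CUSTOM = "ZmserbBoHQtNP+wOcza/LpngG8yJq42KWYj0DSfdikx3VT16IlUAFM97hECvuRX5"

# "=" padding is absent from both tables, so it passes through unchanged.
ENC = str.maketrans(STD, CUSTOM)
DEC = str.maketrans(CUSTOM, STD)


def custom_b64encode(data: bytes) -> str:
    """Encode with standard Base64, then swap each character into CUSTOM."""
    return base64.b64encode(data).decode("ascii").translate(ENC)


def custom_b64decode(text: str) -> bytes:
    """Map CUSTOM characters back to the standard alphabet, then decode."""
    return base64.b64decode(text.translate(DEC))


if __name__ == "__main__":
    encoded = custom_b64encode(b"xhs-pc-web")
    print(encoded, custom_b64decode(encoded))
```

Unlike `b64Encode`, which takes the list of byte values returned by `encodeUtf8`, this sketch takes a `bytes` object, so a caller would wrap that list with `bytes(...)` first.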