A hundred kinds of faults all spring from laziness

Get all invalid PTR records on Route53

I use an autoscaling group to manage stateless servers, so servers come and go every day. When a server comes up I add a PTR record for its internal IP, but when it goes down I never cleaned up the PTR record. As time went by, a lot of invalid PTR records piled up in Route53. To clean up these PTR records in real time, you can write a Lambda function triggered by the server termination event. But how do you clean up the old records in one pass? The straightforward way is to write a script that calls the AWS API to get the PTR list, extracts the IP from each record, tests whether the IP is live, and deletes the record if it is not. Since deleting a Route53 record with awscli is quite troublesome (it involves the JSON format), you are better off writing a Python script for the deletion. Here I just demo some ideas for collecting those records via shell. You could do it in a single line, but to keep things clear and easy to debug I split it into several steps.

Get the PTR record list:

aws route53 list-resource-record-sets --hosted-zone-id xxxxx --query "ResourceRecordSets[?Type=='PTR'].Name" | grep -Po '"(.+?)"' | tr -d \" > ptr.txt

ptr.txt will contain lines like:

1.0.0.10.in-addr.arpa.
2.0.0.10.in-addr.arpa.

Get the IP list from the PTR records:

cat ptr.txt | while read -r line ; do echo -n $line | tac -s.......
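For the deletion step, a minimal Python sketch along the lines suggested above might look like this (boto3 with working AWS credentials is assumed; the hosted zone id is the same placeholder as above, and the ping-based liveness check is an illustrative choice):

```python
# Minimal sketch of the "python script to delete them" idea: list PTR records,
# ping the decoded IP, and delete records whose host is gone.
import subprocess
import boto3

HOSTED_ZONE_ID = "xxxxx"  # placeholder, same as in the awscli example
route53 = boto3.client("route53")

def ptr_to_ip(name):
    # "1.0.0.10.in-addr.arpa." -> "10.0.0.1"
    return ".".join(reversed(name.replace(".in-addr.arpa.", "").split(".")))

def is_live(ip):
    # one ICMP ping with a 1s timeout; swap in a TCP check if ICMP is blocked
    return subprocess.call(["ping", "-c", "1", "-W", "1", ip],
                           stdout=subprocess.DEVNULL) == 0

paginator = route53.get_paginator("list_resource_record_sets")
for page in paginator.paginate(HostedZoneId=HOSTED_ZONE_ID):
    for rrset in page["ResourceRecordSets"]:
        if rrset["Type"] != "PTR":
            continue
        ip = ptr_to_ip(rrset["Name"])
        if not is_live(ip):
            route53.change_resource_record_sets(
                HostedZoneId=HOSTED_ZONE_ID,
                ChangeBatch={"Changes": [{"Action": "DELETE",
                                          "ResourceRecordSet": rrset}]},
            )
            print("deleted", rrset["Name"])
```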

Build a private static website on S3

Building a static website on S3 is very easy, but by default it can be accessed from the open internet. It would be super helpful if we could build a website that is only available inside a VPC; then we could use it to host an internal deb repo, a doc site… The steps are very easy: you only need a VPC endpoint and an S3 bucket policy. The AWS API is open to the internet, so if you access S3 from a VPC, your requests normally pass through the VPC's internet gateway or NAT gateway. With a VPC endpoint (found in the VPC console), your requests to S3 go through AWS's internal network instead. Currently VPC endpoints only support S3; support for DynamoDB is still in testing. To restrict an S3 bucket so it is only available from your VPC, you need to set a bucket policy (to host a static website, enable static website support first). At first, without checking the docs, I tried to restrict access by my VPC's IP CIDR, but that didn't work; I needed to restrict by VPC endpoint id:

{
  "Version": "2012-10-17",
  "Id": "Policy1415115909152",
  "Statement": [
    {
      "Sid": "Access-to-specific-VPCE-only",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Effect": "Allow",
      "Resource": ["arn:aws:s3:::my_secure_bucket", "arn:aws:s3:::my_secure_bucket/*"],
      "Condition": {
        "StringEquals": {
          "aws:sourceVpce": "vpce-1a2b3c4d"
        }
      }
    }
  ]
}

BTW, while the bucket policy restricts access at the VPC endpoint level, with the VPC endpoint itself you can limit access to specific subnets.......
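If you prefer to script it, here is a small sketch of applying the same policy with boto3 (the bucket name and endpoint id are the placeholders from the policy above):

```python
# Hypothetical helper to apply the bucket policy above with boto3
# instead of pasting it into the console.
import json
import boto3

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "Access-to-specific-VPCE-only",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Effect": "Allow",
        "Resource": ["arn:aws:s3:::my_secure_bucket",
                     "arn:aws:s3:::my_secure_bucket/*"],
        "Condition": {"StringEquals": {"aws:sourceVpce": "vpce-1a2b3c4d"}},
    }],
}

s3 = boto3.client("s3")
s3.put_bucket_policy(Bucket="my_secure_bucket", Policy=json.dumps(policy))
```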

Travel notes

I had been feeling restless and scattered lately; things in life and at home all piled up together and left me rather worn out. In early August I took a trip around Japan's Tohoku region, which happened to coincide with the peak of the local festival season, so I joined the crowds for the fun of it, and it was quite interesting. Itinerary: Shanghai -> Tokyo -> Morioka -> Hachinohe -> Lake Towada -> Aomori -> Hirosaki -> Tokyo -> Shanghai, a packed 7 days. Too lazy to post photos, just jotting down a few random notes. Morioka is the capital of Iwate Prefecture, but when I first arrived it felt like the middle of nowhere: in broad daylight, once I left the station area, I ran into only 2 people in over a kilometer of walking, an older guy sleeping under a bridge and someone walking a dog, like a......

Use Redshift Spectrum to query data on S3

Querying S3 data with Redshift Spectrum. When you use Redshift as a data warehouse you normally do a lot of ETL work: the usual flow is to massage data from various sources, dump it onto S3, and then load it from S3 into Redshift. If you have a large amount of historical data to import into Redshift, this process becomes painful; Redshift is not friendly to loading a huge amount of data at once, so you have to do it in batches. In April this year Redshift released a new feature, Spectrum, which lets you query structured data on S3 directly from Redshift. I recently migrated part of our data warehouse to Spectrum, so it's a good time to write about it. Motivation: Glow's data warehouse is built on Redshift and is split into two......
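As a rough sketch of what Spectrum usage looks like from Python (not from the post; the cluster endpoint, IAM role, catalog database, table definition and S3 path are all illustrative assumptions):

```python
# Register an external schema backed by the data catalog, map an S3 Parquet
# prefix as an external table, then query it like a normal Redshift table.
import psycopg2

conn = psycopg2.connect(host="my-cluster.xxxx.us-east-1.redshift.amazonaws.com",
                        port=5439, dbname="analytics",
                        user="admin", password="...")
conn.autocommit = True  # external DDL can't run inside a transaction block
cur = conn.cursor()

cur.execute("""
    CREATE EXTERNAL SCHEMA IF NOT EXISTS spectrum
    FROM DATA CATALOG DATABASE 'spectrumdb'
    IAM_ROLE 'arn:aws:iam::123456789012:role/my-spectrum-role'
    CREATE EXTERNAL DATABASE IF NOT EXISTS
""")

cur.execute("""
    CREATE EXTERNAL TABLE spectrum.events (
        user_id   bigint,
        event     varchar(64),
        event_ts  timestamp
    )
    STORED AS PARQUET
    LOCATION 's3://my-warehouse-bucket/events/'
""")

cur.execute("SELECT event, count(*) FROM spectrum.events GROUP BY event")
print(cur.fetchall())
```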

Enable coredump on Ubuntu 16.04

A coredump file is useful for debugging program crashes. This post shows several settings related to coredumps.

Enable coredump

If you run the program from a shell, enable coredumps via ulimit -c unlimited, then check ulimit -a | grep core; if it shows unlimited, coredumps are enabled for your current session. If your program is hosted by systemd, edit the [Service] section of your program's service unit file and add LimitCORE=infinity to enable coredumps.

Coredump location

The coredump file's location is determined by the kernel parameter kernel.core_pattern. On Ubuntu 16.04 the default value of kernel.core_pattern is |/usr/share/apport/apport %p %s %c %P. The leading | means the coredump is piped to the program that follows. The meanings of %p, %s, %c, %P can be checked via man core. apport saves the dump file in /var/crash. If your default disk partition doesn't have enough space to hold the dump file, you can change kernel.core_pattern to another location, eg: /mnt/core/%e-%t.%P. If redis-server crashes, the dump file will be something like /mnt/core/redis-server-1500000000.1452. Also make sure the crashing process's user has write permission on the target location.

systemd-coredump

You can install systemd-coredump for finer control over dump files: size limits, compression…. Its config file is /etc/systemd/coredump.conf. After installation, it changes kernel.core_pattern to |/lib/systemd/systemd-coredump %P %u %g %s %t 9223372036854775808 %e.......
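As a small illustration (not from the post), redirecting dumps to /mnt/core can be scripted by writing kernel.core_pattern through /proc/sys; this must run as root and does not persist across reboots unless you also add it to /etc/sysctl.d/:

```python
# Point coredumps at /mnt/core using the same %e-%t.%P pattern as the example
# above. Run as root; the change lasts until the next reboot.
import os
from pathlib import Path

core_pattern = Path("/proc/sys/kernel/core_pattern")
target_dir = Path("/mnt/core")

print("current pattern:", core_pattern.read_text().strip())

target_dir.mkdir(parents=True, exist_ok=True)
os.chmod(target_dir, 0o1777)                      # let any crashing process write here
core_pattern.write_text("/mnt/core/%e-%t.%P\n")   # not persistent across reboot

print("new pattern:", core_pattern.read_text().strip())
```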

Python web application performance tuning

Python web application performance tuning. To ship quickly, a lot of the early code was written in whatever way was most convenient, which left many hidden problems, and performance was not great. Because of the GIL, Python is at a natural disadvantage performance-wise; even with a coroutine solution like gevent/eventlet, a time-consuming CPU operation can easily block the whole process. A while ago I did some refactoring of the base code and the results were significant, so here are some notes. Goals: with better performance, the most direct payoff is handling the same traffic with fewer machines, so the target was to shut down 20% of the stateless webservers. Make changes in the framework code as much as possible, without touching business logic. Low risk (......
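A tiny sketch (not from the post) of the gevent claim above: a CPU-bound function never reaches a yield point, so every other greenlet in the process stalls until it finishes:

```python
# Requires gevent to be installed; watch the ticks pause while cpu_heavy() runs.
import time
import gevent

def cpu_heavy():
    # pure-Python loop: holds the GIL and never cooperatively yields
    return sum(i * i for i in range(10_000_000))

def heartbeat():
    for _ in range(10):
        print("tick", round(time.time(), 2))
        gevent.sleep(0.1)

gevent.joinall([gevent.spawn(heartbeat), gevent.spawn(cpu_heavy)])
```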

Build a deb repository with fpm, aptly and S3

I'm lazy: I don't want to become a deb/rpm expert and I don't want to maintain a repo server. I want as little maintenance effort as possible. 🙂 Combining fpm and aptly with AWS S3, we can do it.

Use fpm to convert a Python package to deb

fpm can transform python/gem/npm/dir/… packages into deb/rpm/solaris/… packages. Example:

fpm -s python -t deb -m [email protected] --verbose -v 0.10.1 --python-pip /usr/local/pip Flask

It will transform the Flask 0.10.1 package into a deb. The output package will be python-flask_0.10.1_all.deb.

Notes:

- If a Python package relies on C libs, like MySQLdb (libmysqlclient-dev), you need to install them on the machine that builds the deb binary.
- By default fpm uses easy_install to build packages; some packages like httplib2 have a permission bug with easy_install, so I use pip.
- By default, msgpack-python would be converted to python-msgpack-python. I don't like that, so I add -n python-msgpack to normalize the package name.
- Some packages' dependency version numbers are not valid (eg: celery 3.1.25 depends on pytz >= dev), so I replace the dependency with --python-disable-dependency pytz -d 'pytz >= 2016.7'.
- fpm will not download a package's dependencies automatically; you need to do that yourself.

Use aptly to set up the deb repository

aptly can help build a self-hosted deb repository and publish it on S3.......
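Since fpm won't resolve dependencies for you, a hypothetical helper that loops fpm over a hand-maintained package list might look like this (package names, versions and the pip path are illustrative, following the Flask example above):

```python
# Run fpm once per (package, version) pair; fpm and pip must already be installed.
import subprocess

PACKAGES = [
    ("Flask", "0.10.1"),
    ("itsdangerous", "0.24"),
    ("Werkzeug", "0.11.15"),
]

for name, version in PACKAGES:
    subprocess.check_call([
        "fpm", "-s", "python", "-t", "deb",
        "--verbose",
        "-v", version,
        "--python-pip", "/usr/local/pip",   # same pip path as the example above
        name,
    ])
    print("built deb for", name, version)
```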

Debug Python performance issues with pyflame

pyflame is an open-source tool developed by Uber: https://github.com/uber/pyflame. It takes snapshots of a running Python process and, combined with flamegraph.pl, can output a flamegraph picture of the Python call stacks. It helps analyze the bottlenecks of a Python program, doesn't require injecting any perf code into your application, and its overhead is very low.

Basic usage:

sudo pyflame -s 10 -x -r 0.001 $pid | ./flamegraph.pl > perf.svg

-s, how many seconds to run
-r, sample rate (seconds)

Your output will look something like the following flamegraph. A longer bar means more sample points landed in it, which means that part of the code is slower and has a higher chance of being seen by pyflame. In my case, the output graph has a long IDLE part. Pyflame can only detect call stacks that are holding the GIL, so if the running code doesn't hold the GIL, pyflame doesn't know what it's doing and labels it as IDLE. The following cases will be counted as IDLE:

- Your process is sleeping, doing nothing.
- Waiting for IO (eg: your application is calling a very slow RPC server).
- Calling libs written in C.

The right part is the real application logic code. You can check this part to get an overview of your code's performance. You can also exclude the IDLE part from the graph if you don't care about it, just apply -x......
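To see the IDLE split concretely, here is a toy target program (not from the post) you could profile with the command above: the sleep portion should show up as IDLE, while the pure-Python loop shows up as real stacks:

```python
# Toy profiling target: alternates GIL-releasing sleep with GIL-holding work.
# Run it, then point pyflame at its pid; stop it with Ctrl-C.
import time

def io_like():
    time.sleep(0.5)      # releases the GIL -> sampled as IDLE

def cpu_like():
    return sum(i * i for i in range(2_000_000))  # holds the GIL -> real stack

if __name__ == "__main__":
    while True:
        io_like()
        cpu_like()
```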

Designing Data-Intensive Applications, reading notes, Part 2

Chapters 4, 5, 6

Encoding formats

xml and json are text-based encoding formats; they can't carry raw binary bytes (unless you encode them in base64, which grows the size by about 33%). msgpack is a binary JSON-like encoding, but like them it carries the field names with every record, which wastes a lot of space. thrift and protobuf are binary formats that can take binary bytes and carry only the data; the schema is defined with an IDL (interface definition language). They have code generation tools that generate the code to encode and decode the data, along with checks. Every field of data is bound to a tag (mapped to a field in the IDL file). If a field is defined as required, it can't be removed and its tag value can't be changed, otherwise old code will not be able to decode the data. avro (used in Hadoop) has a writer's schema and a reader's schema; when you store a large file in avro format (containing many records with the same schema), the writer's schema is included with the data. If you use avro for RPC, the avro schema is exchanged during connection setup. When decoding avro, the lib looks at both the writer's schema and the reader's schema and translates the writer's schema into the reader's schema. Forward compatibility means you can have a new version of the schema as writer and an old version of the schema as reader; backward compatibility means you can have a new version of the schema as reader and an old version as writer.......
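A quick sanity check of the ~33% base64 overhead mentioned above (base64 maps every 3 raw bytes to 4 ASCII characters):

```python
# Encode some random bytes and compare sizes; the ratio comes out to ~1.33.
import base64
import os

raw = os.urandom(30_000)
encoded = base64.b64encode(raw)
print(len(raw), len(encoded), len(encoded) / len(raw))  # -> 30000 40000 1.333...
```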

Designing Data-Intensive Applications, reading notes, Part 1

Notes from reading chapter 2, "Data Models and Query Languages", and chapter 3, "Storage and Retrieval"

......