百种弊病,皆从懒生

eks 评测 part-2

上文测试了一下 EKS 和 cluster autoscaler, 本文记录对 persisten volume 的测试. PersistentVolume 创建 gp2 类型的 storageclass, 并用 annotations 设置为默认 sc, dynamic volume provision 会用到: kind: StorageClass apiVersion: storage.k8s.io/v1 metadata: name: gp2 annotations: storageclass.kubernetes.io/is-default-class: "true" provisioner: kubernetes.io/aws-ebs reclaimPolicy: Retain parameters: type: gp2 fsType: ext4 encrypted: "true" 因为 eks 是基于 1.10.3 的, volume expansion 还是 alpha 状态, 没法自动开启(没法改 api server 配置), 所以 storageclass 的 allowVolumeExpansion, 设置了也没用. 这里 encrypted 的值必须是字符串, 否则会创建失败, 而且报错莫名其妙. 创建 pod 的时候指定一个已存在的 ebs volume apiVersion: v1 kind: Pod metadata: name: test spec: volumes: - name: test awsElasticBlockStore: fsType: ext4 volumeID: vol-03670d6294ccf29fd containers: - image: nginx name: nginx volumeMounts: - name: test mountPath: /mnt kubectl -it test -- /bin/bash 进去看一下: root@test:/# df -h......

EKS 评测

EKS 正式 launch 还没有正经用过, 最近总算试了一把, 记录一点. Setup AWS 官方的 Guide 只提供了一个 cloudformation template 来设置 worker node, 我喜欢用 terraform, 可以跟着这个文档尝试:https://www.terraform.io/docs/providers/aws/guides/eks-getting-started.html 来设置完整的 eks cluster 和管理 worker node 的 autoscaling group. 设置完 EKS 后需要添加一条 ConfigMap: apiVersion: v1 kind: ConfigMap metadata: name: aws-auth namespace: kube-system data: mapRoles: | - rolearn: arn:aws:iam::<account-id>:role/eksNodeRole username: system:node:{{EC2PrivateDNSName}} groups: - system:bootstrappers - system:nodes 这样 worker node 节点才能加入集群. 网......

Kubernetes in Action Notes

Miscellaneous notes when reading <Kubernetes in Action>. api group and api version core api group need’t specified in apiVersion field. For example, ReplicationController is on core api group, so only: apiVersion: v1 kind: ReplicationController ... ReplicationSet is added later in app group, v1beta2 version (k8s v1.8): apiVersion: apps/v1beta2 1 kind: ReplicaSet https://kubernetes.io/docs/concepts/overview/kubernetes-api/ ReplicationController VS ReplicationSet ReplicationController is replaced by ReplicationSet, which has more expressive pod selectors. ReplicationController’s label selector only allows matching pods that include a certain label, ReplicationSet can meet multi labels at same time. rs also support operator on key value: In, NotIn, Exists, DoesNotExist If migrate from rc to rs, can delete rc with --cascade=false option, it will delete rc only, but left pods running, then we can create a rs with same selector to make pods under management. DaemonSet DaemonSet ensures pod run exact one copy on one node, useful for processes like monitor agent and log collector. Use node-selector to make ds only run on specific nodes. If node is made unschedulable, normal pods won’t be scheduled to deploy on them, but ds will still be deployed to it, since ds will bypass scheduler. Job Job is used to run a single completable task. Use activeDeadlineSeconds to control job timeout.......

杂谈

忙了好一阵, 两个月没写了, 工作上的事告一段落, 也该补上这笔帐了, 老规矩, 随便写写 :) 最近在做什么? 把公司的代码环境从 python 2.7 升级到 python3.6, 前后忙了 3 个月, 50w 行代码, 也是够呛, 好歹算是顺利完成了, 具体的过程6月零散写过 几篇文章, 大差不差, 后续又碰了不少坑, 但也都能解决. 下一步打算基于 python3 的一些特性对代码做些重构, 从基础库开始吧. 最近看了什么书? <特别的猫>, 特别喜欢的一本书, 不是那种猫奴一昧赞美猫咪多......

升级celery 到 4.2.0 碰到的坑

在把代码往 python3 迁移的过程中需要升级一些第三方库, 升级了 gevent 后发现 celery 有问题, 于是尝试把 celery 从3.1.25 升级到 4.2.0, 中间碰到了很多问题, 记录一点. 配置的变化 CELERY_ACCEPT_CONENT 之前默认是都允许的, 4.0 开始默认值只允许 json, 因为我用的是msgpack, 所以需要修改这个配置让它接受 msgpack. CELERY_RESULT_SERIALIZER 之前默认是pickle, 现在默认也变成了json, 如果task 的返回结果是 binary 的话, json 无法处理,要么把结果 base64 编码, 要么把CELERY_RESULT_SERI......

编写 python 2/3 兼容代码

上一篇 里简单得提了一点开始做 python 2 到 python3 迁移时候碰到的问题, 和工具的选择(推荐用 six).这篇讲下编写 python 2 / 3 兼容代码要注意的事情. _future_ python2 里自带的向后兼容模块,将 python3 的一些语法行为 backport 到 python2 里, 使用的时候需要在文件头部声明, 作用域只在当前文件. 首先是几个在 python 2.7 里不用特意写,已经默认开启的特性: from __future__ import nested_scopes 2.2 开始就默认开启了,用于修改嵌套函数内的变量搜索作用域, 在此之前, 全局模块的优先级比被嵌套函数的父函数要高, 现......

杂谈松本清张

去年在 kindle 上买了套松本清张的合集, 总共有10本, 断断续续看到现在终于看完了第 9 本<隐花平原>, 随便扯一点(喂,为什么不看完最后一本! 松本清张 (1909 ~ 1992), 社会派推理开创人, 这套书里的作品各个年代都有, 最有名的<砂之器>(貌似仲间姐姐拍过剧?) 和 <点与线> 却没收录. 本格派讲究逻辑的精巧, 整个故事就像在玩密室逃脱一样, 一环扣一环, 最后谜底揭开的时候让人惊呼"卧槽&q......

From python2 to python3

This article won’t provide perfect guide for porting py2 code to py3, just list the solutions I tried, the problems I come to, and my choices. I haven’t finished this project, also I haven’t gave up so far :). Won’t explain too much about the differences between py2 and py3, will write down some corner cases which are easy to miss. The codebase I’m working on: Only support python2.7, don’t consider python2.6 1X repos, about half a million lines of code in total (calculated by cloc). These repos will import each other, bad design from early days, not easy to resolve, which means I can’t switch to py3 one by one, I need write py2/3 compatiblility code for them, and switch together(I’m also considering solve the import problem first). Test coverage is not good, best is around 80%, lowest is 30%. Tools 2to3, a command line tools packaged with py2, it’s a oneway porting to convert your code to py3, new code won’t work under py2, since I need be compatible with py2 and py3 for long time, didn’t try it. future, it tries to make you write single clean python3.x code without ugly hack with six. I used it it first, but come to many problems, will explain later.......

在python3.7 中实现python2.7 的内置 hash 函数

最近着手准备从 python2.7 迁移到 python3.7, 还没开始就碰到一个问题. 老系统里有一部分竟然是将 python 内置 hash 函数的结果存进了数据库, 这个做法绝对是错的, hash 的结果本来就没有保证过在各个版本的 python 中保证一致. 而且 python3 中算法完全变了, 默认在进程初始化的时候会用随机种子加进 hash 过程, 所以python 进程 一重启结果就不一样了. 木已成舟, 目前看将数据库里的值全部改掉是不可能了, 只能在 python3 中重新实现一下这个算法. python2.7 中的hash 算法是 fnv (有修改),......

Use SNS & SQS to build Pub/Sub System

Recently, we build pub/sub system based on AWS’s SNS & SQS service, take some notes. Originally, we have an pub/sub system based on redis(use BLPOP to listen to a redis list). It’s really simple, and mainly for cross app operations. Now we have needs to enhance it to support more complex pubsub logic, eg: topic based distribution. It don’t support redelivery as well, if subscribers failed to process the message, message will be dropped. There’re three obvious choices in my mind: kafka AMQP based system (rabbitmq,activemq …) SNS + SQS My demands for this system are: Support message persistence. Support topic based message distribution. Easy to manage. The data volume won’t be very large, so performance and throughput won’t be critical concerns. I choose SNS + SQS, main concerns are from operation side: kafka need zookeeper to support cluster. rabbitmq need extra configuration for HA, and AMQP model is relatively complex for programming. So my decision is: application publish message to SNS topic Setup multi SQS queues to subscribe SNS topic Let different application processes to subscribe to different queues to finish its logic. SQS and SNS is very simple, not too much to say, just some notes: SQS queue have two types, FIFO queue and standard queue.......