Designing Data-Intensive Applications, reading notes, Part 1

Notes from reading Chapter 2, “Data Models and Query Languages”, and Chapter 3, “Storage and Retrieval”.

......

2017.05.04 lang-en book Database

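April weekend notes

I moved at the beginning of the month, and every weekend since has been taken up by odds and ends. Thinking back on it, apart from hauling a TV stand home from IKEA, I can't even remember what I did…

This weekend, on a whim, I went to two talks, one on the humanities and one on technology, and ran into something interesting at both.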

......

2017.04.23

Infrastructure as Code

Creating virtual resources on AWS is very convenient, but managing them becomes a problem as your infrastructure grows.

You will run into questions like:

How do you explain the details of the live setup to your colleagues (for example: how is our prod VPC set up? what is the DHCP option set?)? Navigating the AWS console works, but it is not convenient.
Who did what to which resource, and when? AWS has a service called Config that can track these changes, but making the history as clear as a git log still takes a lot of work.

Ideally, we should manage AWS resources like code, with every change kept in a VCS. This is what is called Infrastructure as Code.

I’ve tried three ways to do it:

Ansible
CloudFormation
Terraform

In this article I’ll compare them. Spoiler: the conclusion is to use Terraform 🙂

Ansible

Provisioning tools such as Ansible, Chef, and Puppet can all create AWS resources, but they share some common problems:

It is hard to track changes after the initial bootstrap.
There is no confidence about what a run will do to existing resources.

For example, I define a security group in Ansible:

ec2_group:
  name: "web"
  description: "security group in web"
  vpc_id: "vpc-xxx"
  region: "us-east-1"
  rules:
    - proto: tcp
      from_port: 80
      to_port: 80
      cidr_ip: 0.0.0.0/0

It will create a security group named “web” in vpc-xxx. At first glance, it’s convenient and straightforward.
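
The second problem listed earlier, not knowing what a run will do to existing resources, comes down to having no diff between the declared state and the actual state. Below is a minimal Python sketch of such a diff. It is only an illustration: the `plan` function, the rule dictionaries, and the manually added SSH rule are all hypothetical, and this is not how Ansible or Terraform is implemented.

```python
def plan(declared, actual):
    """Return (action, rule) tuples that would turn `actual` into `declared`.

    Both arguments are lists of rule dicts that share the same keys
    (a hypothetical shape modelled on the task above).
    """
    declared_set = {tuple(sorted(r.items())) for r in declared}
    actual_set = {tuple(sorted(r.items())) for r in actual}
    to_add = [("add", dict(r)) for r in sorted(declared_set - actual_set)]
    to_remove = [("remove", dict(r)) for r in sorted(actual_set - declared_set)]
    return to_add + to_remove


# What the task file declares.
declared_rules = [
    {"proto": "tcp", "from_port": 80, "to_port": 80, "cidr_ip": "0.0.0.0/0"},
]

# What exists on AWS (hypothetical): someone added an SSH rule by hand.
actual_rules = [
    {"proto": "tcp", "from_port": 80, "to_port": 80, "cidr_ip": "0.0.0.0/0"},
    {"proto": "tcp", "from_port": 22, "to_port": 22, "cidr_ip": "10.0.0.0/8"},
]

changes = plan(declared_rules, actual_rules)
for action, rule in changes:
    print(action, rule)
```

Seeing a list like `changes` before anything is applied is the kind of confidence that is missing when a task file is simply run against live resources.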

......

2017.04.21 lang-en AWS server-infra

Concurrency in Go, Reading Notes

A few notes taken when reading

......

2017.04.19 lang-en book golang

MySQL partition table

Overview

MySQL has built-in support for partitioned tables, which splits a table's data across multiple underlying partitions while still providing the same unified query interface as a normal table.

Benefits:

Easy data management: if we need to archive old data and the table is partitioned by datetime, we can drop the old partitions directly.
Faster queries that filter on the partition key (partition pruning).
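
To make the two benefits above concrete, here is a small Python model of a table partitioned by month. It is a sketch of the idea only, not of MySQL internals: the class `MonthlyPartitionedTable` and its methods are hypothetical names invented for this illustration.

```python
from datetime import date


class MonthlyPartitionedTable:
    """Toy model: rows are stored per (year, month) partition."""

    def __init__(self):
        self.partitions = {}  # (year, month) -> list of (created, payload)

    def insert(self, created, payload):
        key = (created.year, created.month)
        self.partitions.setdefault(key, []).append((created, payload))

    def query(self, start, end):
        """Return rows with start <= created <= end.

        Only partitions whose month overlaps the range are scanned,
        which is the idea behind partition pruning.
        """
        lo, hi = (start.year, start.month), (end.year, end.month)
        scanned = [k for k in sorted(self.partitions) if lo <= k <= hi]
        rows = [r for k in scanned for r in self.partitions[k]
                if start <= r[0] <= end]
        return rows, scanned

    def drop_before(self, cutoff):
        """Drop whole partitions older than the cutoff month."""
        old = sorted(k for k in self.partitions
                     if k < (cutoff.year, cutoff.month))
        for k in old:
            del self.partitions[k]
        return old


t = MonthlyPartitionedTable()
t.insert(date(2017, 1, 15), "a")
t.insert(date(2017, 2, 3), "b")
t.insert(date(2017, 3, 20), "c")

rows, scanned = t.query(date(2017, 2, 1), date(2017, 2, 28))
dropped = t.drop_before(date(2017, 3, 1))
remaining = sorted(t.partitions)
```

The February query touches a single partition, and archiving removes whole partitions at once instead of deleting rows one by one.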

Limitations:

For a partitioned table, every unique key (including the primary key) must include every column used in the table’s partitioning expression.
With the InnoDB engine, a partitioned table can’t have foreign keys, and can’t have columns referenced by foreign keys.
With the MyISAM engine on MySQL <= 5.6.5, a DML operation locks all partitions as a whole.
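
The unique key rule is easy to get wrong when designing a schema, so here is a small Python check that expresses it. This is a sketch under assumptions: the function name and the table layout (`id`, `order_no`, `created_at`) are hypothetical, and the check only compares column names, it does not parse real partitioning expressions.

```python
def violating_unique_keys(partition_columns, unique_keys):
    """Return the names of unique keys that miss a partitioning column.

    partition_columns: columns used in the partitioning expression.
    unique_keys: mapping of key name -> list of columns in that key
                 (the primary key counts as a unique key).
    """
    required = set(partition_columns)
    return [name for name, cols in unique_keys.items()
            if not required <= set(cols)]


# Hypothetical table partitioned on created_at.
bad = violating_unique_keys(
    ["created_at"],
    {"PRIMARY": ["id"], "uk_order": ["order_no", "created_at"]},
)

# Adding created_at to the primary key satisfies the rule.
fixed = violating_unique_keys(
    ["created_at"],
    {"PRIMARY": ["id", "created_at"], "uk_order": ["order_no", "created_at"]},
)
```

In the first call the primary key on `id` alone is reported, which is the common case where an auto-increment primary key has to be widened before the table can be partitioned by date.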

......

2017.04.05 Database MySQL

ElasticSearch cluster

In this article, let’s talk about ElasticSearch’s cluster mode, that is, running ElasticSearch on multiple nodes.

Basic concepts

cluster: a collection of server nodes with the same cluster.name setting in elasticsearch.yml

primary shards: an index is divided into multiple parts (5 by default), and the shards of an index can be distributed over multiple nodes. This lets an index scale horizontally and be accessed in parallel (across multiple nodes).

replicas: backup copies of shards. Replicas can also handle search requests, which means you can scale search capacity horizontally by adding replicas.
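
To picture how primaries and replicas spread over a cluster, here is a small Python sketch that places shard copies on nodes in round-robin order so that a replica never lands on the same node as its primary. The `allocate` function and the node names are hypothetical, and the placement strategy is a simplification for illustration, not ElasticSearch's actual allocation algorithm.

```python
def allocate(nodes, primaries, replicas_per_primary):
    """Place each shard's primary and replicas on distinct nodes.

    Returns a mapping of node name -> list of labels such as "P0" (primary
    of shard 0) or "R0" (a replica of shard 0).
    """
    if replicas_per_primary >= len(nodes):
        # Simplification: this sketch refuses the layout outright.
        raise ValueError("not enough nodes to keep every copy on its own node")
    layout = {n: [] for n in nodes}
    for shard in range(primaries):
        for copy in range(replicas_per_primary + 1):
            node = nodes[(shard + copy) % len(nodes)]
            label = ("P" if copy == 0 else "R") + str(shard)
            layout[node].append(label)
    return layout


layout = allocate(["node-1", "node-2", "node-3"],
                  primaries=5, replicas_per_primary=1)
for node, shards in layout.items():
    print(node, shards)
```

With 5 primaries and 1 replica each, the 10 shard copies are spread over the three nodes, so both indexing and search load are shared, and losing any single node still leaves one copy of every shard.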

......

2017.03.22 distribute elasticsearch

Matrix 14 years later

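On a whim, I rewatched the Matrix trilogy. The Wachowski brothers of back then have since become the Wachowski sisters; how times change…

The first time I watched it was, I think, in junior high. I remember watching the third film on my uncle's pirated disc. That day I dragged a classmate along to watch it with me and then treated him to instant noodles plus ice cream. He got an upset stomach as soon as he got home and complained to me about it for ages, which is why I remember it so vividly, haha.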

......

2017.03.11

Bigtable notes

Messy notes to go along with reading the paper.

......

2016.12.11 bigtable distribute paper

GFS notes

I read Google’s GFS paper from long ago and took some notes.

......

2016.11.19 distribute gfs paper

Migrate to encrypted RDS

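Recently my company has been working on HIPAA compliance, and one of the requirements is that all databases must have encryption enabled.

The annoying part is that RDS encryption can only be set when the instance is created and cannot be changed afterwards, so we have to migrate the production database.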

......

2016.10.28 AWS Database MySQL server-infra

Shining Moon
